date:20191012

Re: (was Fair Trade O.S.) 0yZ Varanger Sys

2019-10-12 Thread Ywe Cærlyn

Now also added 0yZ to its title, meaning 0 triune god, and 0 jesus on a cross. 
- I think that is what everybody here wants!



‐‐‐ Original Message ‐‐‐
On Wednesday 9. October 2019 kl. 09:52, Ywe Cærlyn  
wrote:

> Ok, I think I have a full and complete view of operating systems philosophy.
>
> Basically a good O.S. philosophy can be traced back to 1000 A.D. and the 
> Saxons forcing the bible, and "god" on people in Norway. Still, Norway never 
> really became un-varangian.
> Today contemplating a background on a good O.S. based on my research, I 
> understand that varangian culture was quite good, and building further on it, 
> ridding oneself of the christian trinitarian god, we can also have a good O.S.
>
> Therefore the system is now named Varanger Sys, with Cider as the original 
> drink of Tór. And the EDM culture of the 90s that Norway was largely 
> influential in, I have named Úpp Varanger EDM, that went all the way up to 
> Kygo (and Maren), that represents types already established in the 90s, which 
> I was part of. Very Norwegian, Scandinavian, probably European, and maybe 
> other places. And we fully support EU.
>
> This is the final basis of my operating system philosophy, and cultural 
> aspect. Small changes may come, if it improves things.
>
> Best Greetings,
> Ywe Cærlyn
> Lead & Philosophy
> Varanger Sys
>
> https://www.youtube.com/channel/UCR3gmLVjHS5A702wo4bol_Q

Re: [PATCH bpf] libbpf: fix passing uninitialized bytes to setsockopt

2019-10-12 Thread Alexei Starovoitov

On Sat, Oct 12, 2019 at 9:52 PM Ilya Maximets  wrote:
>
> 'struct xdp_umem_reg' has 4 bytes of padding at the end that makes
> valgrind complain about passing uninitialized stack memory to the
> syscall:
>
>   Syscall param socketcall.setsockopt() points to uninitialised byte(s)
> at 0x4E7AB7E: setsockopt (in /usr/lib64/libc-2.29.so)
> by 0x4BDE035: xsk_umem__create@@LIBBPF_0.0.4 (xsk.c:172)
>   Uninitialised value was created by a stack allocation
> at 0x4BDDEBA: xsk_umem__create@@LIBBPF_0.0.4 (xsk.c:140)
>
> Padding bytes appeared after introducing of a new 'flags' field.
>
> Fixes: 10d30e301732 ("libbpf: add flags to umem config")
> Signed-off-by: Ilya Maximets 

Something is not right with (e|g)mail.
This is 3rd email I got with the same patch.
First one (the one that was applied) was 3 days ago.
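
For readers following the quoted commit message: the tail padding it describes can be reproduced in user space. The following is a minimal sketch, assuming a stand-in struct (`umem_reg_like`, a hypothetical name) with two 64-bit and three 32-bit members; it is not the libbpf code, and `fill_reg()` only illustrates the usual remedy of zeroing the whole object before filling it in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in: the members occupy 28 bytes, while 8-byte
 * alignment rounds sizeof() up to 32 on common 64-bit ABIs. */
struct umem_reg_like {
	uint64_t addr;
	uint64_t len;
	uint32_t chunk_size;
	uint32_t headroom;
	uint32_t flags;
};

/* Number of bytes at the end of the struct that belong to no member. */
static size_t tail_padding(void)
{
	return sizeof(struct umem_reg_like) -
	       (offsetof(struct umem_reg_like, flags) + sizeof(uint32_t));
}

/* Zero the whole object first, so that padding bytes hold defined values
 * before the struct is passed to a syscall such as setsockopt(). */
static void fill_reg(struct umem_reg_like *mr, uint64_t addr, uint64_t len)
{
	assert(mr != NULL);
	memset(mr, 0, sizeof(*mr));
	mr->addr = addr;
	mr->len = len;
	mr->chunk_size = 2048;	/* arbitrary illustrative value */
	mr->headroom = 0;
	mr->flags = 0;
}
```

Initializing the members one by one without the memset() would leave the padding bytes untouched, which is the situation valgrind reports above.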

[PATCH] writeback: Fix a warning while "make xmldocs"

2019-10-12 Thread Masanari Iida

This patch fixes the following warning.
./fs/fs-writeback.c:918: warning: Excess function parameter
'nr_pages' description in 'cgroup_writeback_by_id'

Signed-off-by: Masanari Iida 
---
 fs/fs-writeback.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index e88421d9a48d..8461a6322039 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -905,7 +905,7 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
  * cgroup_writeback_by_id - initiate cgroup writeback from bdi and memcg IDs
  * @bdi_id: target bdi id
  * @memcg_id: target memcg css id
- * @nr_pages: number of pages to write, 0 for best-effort dirty flushing
+ * @nr: number of pages to write, 0 for best-effort dirty flushing
  * @reason: reason why some writeback work initiated
  * @done: target wb_completion
  *
-- 
2.23.0.526.g70bf0b755af4

Re: [PATCH] mt76: mt76x2: disable pcie_aspm by default

2019-10-12 Thread kbuild test robot

Hi Lorenzo,

I love your patch! Yet something to improve:

[auto build test ERROR on wireless-drivers-next/master]
[cannot apply to v5.4-rc2 next-20191011]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Lorenzo-Bianconi/mt76-mt76x2-disable-pcie_aspm-by-default/20191013-093134
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next.git master
config: x86_64-allyesconfig (attached as .config)
compiler: gcc-7 (Debian 7.4.0-13) 7.4.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

>> drivers/net/wireless/mediatek/mt76/mmio.c:7:10: fatal error: linux/pci-aspm.h: No such file or directory
    #include <linux/pci-aspm.h>
             ^~~~~~~~~~~~~~~~~~
   compilation terminated.

vim +7 drivers/net/wireless/mediatek/mt76/mmio.c

   > 7  #include <linux/pci-aspm.h>
 8  

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



[PATCH] ACPI Documentation: Minor Spelling Fix

2019-10-12 Thread James Pack

Very minor spelling fix in ACPI documentation

Signed-off-by: James Pack 
---
 Documentation/firmware-guide/acpi/namespace.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/firmware-guide/acpi/namespace.rst b/Documentation/firmware-guide/acpi/namespace.rst
index 835521baeb89..3eb763d6656d 100644
--- a/Documentation/firmware-guide/acpi/namespace.rst
+++ b/Documentation/firmware-guide/acpi/namespace.rst
@@ -261,7 +261,7 @@ Description Tables contain information used for the creation of the
 struct acpi_device objects represented by the given row (xSDT means DSDT
 or SSDT).
 
-The forth column of the above table indicates the 'bus_id' generation
+The fourth column of the above table indicates the 'bus_id' generation
 rule of the struct acpi_device object:
 
_HID:
-- 
2.20.1

Re: [PATCH net-next v3] genetlink: do not parse attributes for families with zero maxattr

2019-10-12 Thread Jakub Kicinski

On Fri, 11 Oct 2019 09:40:09 +0200, Michal Kubecek wrote:
> Commit c10e6cf85e7d ("net: genetlink: push attrbuf allocation and parsing
> to a separate function") moved attribute buffer allocation and attribute
> parsing from genl_family_rcv_msg_doit() into a separate function
> genl_family_rcv_msg_attrs_parse() which, unlike the previous code, calls
> __nlmsg_parse() even if family->maxattr is 0 (i.e. the family does its own
> parsing). The parser error is ignored and does not propagate out of
> genl_family_rcv_msg_attrs_parse() but an error message ("Unknown attribute
> type") is set in extack and if further processing generates no error or
> warning, it stays there and is interpreted as a warning by userspace.
> 
> Dumpit requests are not affected as genl_family_rcv_msg_dumpit() bypasses
> the call of genl_family_rcv_msg_attrs_parse() if family->maxattr is zero.
> Move this logic inside genl_family_rcv_msg_attrs_parse() so that we don't
> have to handle it in each caller.
> 
> v3: put the check inside genl_family_rcv_msg_attrs_parse()
> v2: adjust also argument of genl_family_rcv_msg_attrs_free()
> 
> Fixes: c10e6cf85e7d ("net: genetlink: push attrbuf allocation and parsing to a separate function")
> Signed-off-by: Michal Kubecek 

Acked-by: Jakub Kicinski
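
The control flow described in the quoted commit message (skip parsing entirely when the family's maxattr is zero, and do so inside the parse helper rather than in each caller) can be sketched as a small user-space model. All names below (`family_model`, `attr_model`, `generic_parse`, `attrs_parse`) are hypothetical stand-ins, not the genetlink API, and the model returns NULL on allocation failure instead of reproducing the kernel's error reporting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct family_model {
	unsigned int maxattr;	/* 0: the family does its own parsing */
};

struct attr_model {
	int type;
};

static int parser_calls;	/* how often the generic parser ran */

/* Stand-in for the generic attribute parser. */
static int generic_parse(struct attr_model **tb, unsigned int maxattr)
{
	(void)tb;
	(void)maxattr;
	parser_calls++;
	return 0;
}

/* Returns NULL without parsing when maxattr is 0; otherwise returns a
 * freshly allocated attribute table that the caller must free(). */
static struct attr_model **attrs_parse(const struct family_model *family)
{
	struct attr_model **tb;

	assert(family != NULL);
	if (!family->maxattr)
		return NULL;	/* nothing to allocate, nothing to parse */

	tb = calloc(family->maxattr + 1, sizeof(*tb));
	if (!tb)
		return NULL;
	if (generic_parse(tb, family->maxattr) < 0) {
		free(tb);
		return NULL;
	}
	return tb;
}
```

With the check inside the helper, a caller for a self-parsing family never reaches the generic parser, so no stray "Unknown attribute type" message can be left behind for it.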

Re: [PATCH 3/7] Add a UFFD_SECURE flag to the userfaultfd API.

2019-10-12 Thread Daniel Colascione

On Sat, Oct 12, 2019 at 6:14 PM Andy Lutomirski  wrote:
>
..
>
> > But maybe we can go further: let's separate authentication and
> > authorization, as we do in other LSM hooks. Let's split my
> > inode_init_security_anon into two hooks, inode_init_security_anon and
> > inode_create_anon. We'd define the former to just initialize the file
> > object's security information --- in the SELinux case, figuring out
> > its class and SID --- and define the latter to answer the yes/no
> > question of whether a particular anonymous inode creation should be
> > allowed. Normally, anon_inode_getfile2() would just call both hooks.
> > We'd add another anon_inode_getfd flag, ANON_INODE_SKIP_AUTHORIZATION
> > or something, that would tell anon_inode_getfile2() to skip calling
> > the authorization hook, effectively making the creation always
> > succeed. We can then make the UFFD code pass
> > ANON_INODE_SKIP_AUTHORIZATION when it's creating a file object in the
> > fork child while creating UFFD_EVENT_FORK messages.
>
> That sounds like an improvement.  Or maybe just teach SELinux that
> this particular fd creation is actually making an anon_inode that is a
> child of an existing anon inode and that the context should be copied
> or whatever SELinux wants to do.  Like this, maybe:
>
> static int resolve_userfault_fork(struct userfaultfd_ctx *ctx,
>   struct userfaultfd_ctx *new,
>   struct uffd_msg *msg)
> {
> int fd;
>
> Change this:
>
> fd = anon_inode_getfd("[userfaultfd]", &userfaultfd_fops, new,
>   O_RDWR | (new->flags & UFFD_SHARED_FCNTL_FLAGS));
>
> to something like:
>
>   fd = anon_inode_make_child_fd(..., ctx->inode, ...);
>
> where ctx->inode is the one context's inode.

Yeah. I figured we could just add a special-purpose hook for this
case. Having a special hook for this one case feels ugly though, and
at copy_mm time, we don't have a PID for the new child yet --- I don't
know whether LSMs would care about that. But maybe this is one of
those "doctor, it hurts when I do this!" situations and this child
process difficulty is just a hint that some other design might work
better.

> Now that you've pointed this mechanism out, it is utterly and
> completely broken and should be removed from the kernel outright or at
> least severely restricted.  A .read implementation MUST NOT ACT ON THE
> CALLING TASK.  Ever.  Just imagine the effect of passing a userfaultfd
> as stdin to a setuid program.
>
> So I think the right solution might be to attempt to *remove*
> UFFD_EVENT_FORK.  Maybe the solution is to say that, unless the
> creator of a userfaultfd() has global CAP_SYS_ADMIN, it cannot
> use UFFD_FEATURE_EVENT_FORK, and to print a warning (once) when
> UFFD_FEATURE_EVENT_FORK is allowed.  And, after some suitable
> deprecation period, just remove it.  If it's genuinely useful, it
> needs an entirely new API based on ioctl() or a syscall.  Or even
> recvmsg() :)

IMHO, userfaultfd should have been a datagram socket from the start.
As you point out, it's a good fit for the UFFD protocol, which
involves FD passing and a fixed message size.

> And UFFD_SECURE should just become automatic, since you don't have a
> problem any more. :-p

Agreed. I'll wait to hear what everyone else has to say.

Re: [PATCH 3/7] Add a UFFD_SECURE flag to the userfaultfd API.

2019-10-12 Thread Andy Lutomirski

[adding more people because this is going to be an ABI break, sigh]

On Sat, Oct 12, 2019 at 5:52 PM Daniel Colascione  wrote:
>
> On Sat, Oct 12, 2019 at 4:10 PM Andy Lutomirski  wrote:
> >
> > On Sat, Oct 12, 2019 at 12:16 PM Daniel Colascione wrote:
> > >
> > > The new secure flag makes userfaultfd use a new "secure" anonymous
> > > file object instead of the default one, letting security modules
> > > supervise userfaultfd use.
> > >
> > > Requiring that users pass a new flag lets us avoid changing the
> > > semantics for existing callers.
> >
> > Is there any good reason not to make this be the default?
> >
> >
> > The only downside I can see is that it would increase the memory usage
> > of userfaultfd(), but that doesn't seem like such a big deal.  A
> > lighter-weight alternative would be to have a single inode shared by
> > all userfaultfd instances, which would require a somewhat different
> > internal anon_inode API.
>
> I'd also prefer to just make SELinux use mandatory, but there's a
> nasty interaction with UFFD_EVENT_FORK. Adding a new UFFD_SECURE mode
> which blocks UFFD_EVENT_FORK sidesteps this problem. Maybe you know a
> better way to deal with it.

...

> But maybe we can go further: let's separate authentication and
> authorization, as we do in other LSM hooks. Let's split my
> inode_init_security_anon into two hooks, inode_init_security_anon and
> inode_create_anon. We'd define the former to just initialize the file
> object's security information --- in the SELinux case, figuring out
> its class and SID --- and define the latter to answer the yes/no
> question of whether a particular anonymous inode creation should be
> allowed. Normally, anon_inode_getfile2() would just call both hooks.
> We'd add another anon_inode_getfd flag, ANON_INODE_SKIP_AUTHORIZATION
> or something, that would tell anon_inode_getfile2() to skip calling
> the authorization hook, effectively making the creation always
> succeed. We can then make the UFFD code pass
> ANON_INODE_SKIP_AUTHORIZATION when it's creating a file object in the
> fork child while creating UFFD_EVENT_FORK messages.

That sounds like an improvement.  Or maybe just teach SELinux that
this particular fd creation is actually making an anon_inode that is a
child of an existing anon inode and that the context should be copied
or whatever SELinux wants to do.  Like this, maybe:

static int resolve_userfault_fork(struct userfaultfd_ctx *ctx,
  struct userfaultfd_ctx *new,
  struct uffd_msg *msg)
{
int fd;

Change this:

fd = anon_inode_getfd("[userfaultfd]", &userfaultfd_fops, new,
  O_RDWR | (new->flags & UFFD_SHARED_FCNTL_FLAGS));

to something like:

  fd = anon_inode_make_child_fd(..., ctx->inode, ...);

where ctx->inode is the one context's inode.

*** HOWEVER *** !!!

Now that you've pointed this mechanism out, it is utterly and
completely broken and should be removed from the kernel outright or at
least severely restricted.  A .read implementation MUST NOT ACT ON THE
CALLING TASK.  Ever.  Just imagine the effect of passing a userfaultfd
as stdin to a setuid program.

So I think the right solution might be to attempt to *remove*
UFFD_EVENT_FORK.  Maybe the solution is to say that, unless the
creator of a userfaultfd() has global CAP_SYS_ADMIN, it cannot
use UFFD_FEATURE_EVENT_FORK, and to print a warning (once) when
UFFD_FEATURE_EVENT_FORK is allowed.  And, after some suitable
deprecation period, just remove it.  If it's genuinely useful, it
needs an entirely new API based on ioctl() or a syscall.  Or even
recvmsg() :)

And UFFD_SECURE should just become automatic, since you don't have a
problem any more. :-p

--Andy

Re: [PATCH 0/2] media: meson: vdec: Add compliant H264 support

2019-10-12 Thread Nicolas Dufresne

Le lundi 07 octobre 2019 à 16:59 +0200, Maxime Jourdan a écrit :
> Hello,
> 
> This patch series aims to bring H.264 support as well as compliance update
> to the amlogic stateful video decoder driver.
> 
> There is 1 issue that remains currently:
> 
>  - The following codepath had to be commented out from v4l2-compliance as
> it led to stalling:
> 
> if (node->codec_mask & STATEFUL_DECODER) {
>   struct v4l2_decoder_cmd cmd;
>   buffer buf_cap(m2m_q);
> 
>   memset(&cmd, 0, sizeof(cmd));
>   cmd.cmd = V4L2_DEC_CMD_STOP;
> 
>   /* No buffers are queued, call STREAMON, then STOP */
>   fail_on_test(node->streamon(q.g_type()));
>   fail_on_test(node->streamon(m2m_q.g_type()));
>   fail_on_test(doioctl(node, VIDIOC_DECODER_CMD, &cmd));
> 
>   fail_on_test(buf_cap.querybuf(node, 0));
>   fail_on_test(buf_cap.qbuf(node));
>   fail_on_test(buf_cap.dqbuf(node));
>   fail_on_test(!(buf_cap.g_flags() & V4L2_BUF_FLAG_LAST));
>   for (unsigned p = 0; p < buf_cap.g_num_planes(); p++)
>     fail_on_test(buf_cap.g_bytesused(p));
>   fail_on_test(node->streamoff(q.g_type()));
>   fail_on_test(node->streamoff(m2m_q.g_type()));
> 
>   /* Call STREAMON, queue one CAPTURE buffer, then STOP */
>   fail_on_test(node->streamon(q.g_type()));
>   fail_on_test(node->streamon(m2m_q.g_type()));
>   fail_on_test(buf_cap.querybuf(node, 0));
>   fail_on_test(buf_cap.qbuf(node));
>   fail_on_test(doioctl(node, VIDIOC_DECODER_CMD, &cmd));
> 
>   fail_on_test(buf_cap.dqbuf(node));
>   fail_on_test(!(buf_cap.g_flags() & V4L2_BUF_FLAG_LAST));
>   for (unsigned p = 0; p < buf_cap.g_num_planes(); p++)
>     fail_on_test(buf_cap.g_bytesused(p));
>   fail_on_test(node->streamoff(q.g_type()));
>   fail_on_test(node->streamoff(m2m_q.g_type()));
> }
> 
> The reason is that the driver has a limitation where all
> capture buffers must be queued to the driver before STREAMON is effective.
> The firmware needs to know in advance what all the buffers are before
> starting to decode.
> This limitation is enforced via q->min_buffers_needed.
> As such, in this compliance codepath, STREAMON is never actually called
> driver-side and there is a stall on fail_on_test(buf_cap.dqbuf(node));
> 
> 
> One last detail: V4L2_FMT_FLAG_DYN_RESOLUTION is currently not recognized
> by v4l2-compliance, so it was left out for the test. However, it is
> present in the patch series.
> 
> The second patch has 3 "Alignment should match open parenthesis" lines
> where I preferred to keep them that way.
> 
> Thanks Stanimir for sharing your HDR file creation tools, this was very
> helpful :).

I tried to test this with a pending branch of GStreamer supporting
dynamic resolution changes. The event-driven mechanism does not seem to
work with this driver. I've grepped the code, and don't see any places
where the event would be emitted.

Then I grepped, and it seems the driver accepts source_change
subscription but does not set V4L2_FMT_FLAG_DYN_RESOLUTION. I believe
these two things are a bit redundant and confusing; I'll fix the proposed
patch nevertheless, and see if that makes it work.

> 
> Maxime
> 
> # v4l2-compliance --stream-from-hdr test-25fps.h264.hdr -s250
> v4l2-compliance SHA: a162244d47d4bb01d0692da879dce5a070f118e7, 64 bits
> 
> Compliance test for meson-vdec device /dev/video0:
> 
> Driver Info:
>   Driver name      : meson-vdec
>   Card type        : Amlogic Video Decoder
>   Bus info         : platform:meson-vdec
>   Driver version   : 5.4.0
>   Capabilities     : 0x84204000
>           Video Memory-to-Memory Multiplanar
>           Streaming
>           Extended Pix Format
>           Device Capabilities
>   Device Caps      : 0x04204000
>           Video Memory-to-Memory Multiplanar
>           Streaming
>           Extended Pix Format
>   Detected Stateful Decoder
> 
> Required ioctls:
>   test VIDIOC_QUERYCAP: OK
> 
> Allow for multiple opens:
>   test second /dev/video0 open: OK
>   test VIDIOC_QUERYCAP: OK
>   test VIDIOC_G/S_PRIORITY: OK
>   test for unlimited opens: OK
> 
> Debug ioctls:
>   test VIDIOC_DBG_G/S_REGISTER: OK (Not Supported)
>   test VIDIOC_LOG_STATUS: OK (Not Supported)
> 
> Input ioctls:
>   test VIDIOC_G/S_TUNER/ENUM_FREQ_BANDS: OK (Not Supported)
>   test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
>   test VIDIOC_S_HW_FREQ_SEEK: OK (Not Supported)
>   test VIDIOC_ENUMAUDIO: OK (Not Supported)
>   test VIDIOC_G/S/ENUMINPUT: OK (Not Supported)
>   test VIDIOC_G/S_AUDIO: OK (Not Supported)
>   Inputs: 0 Audio Inputs: 0 Tuners: 0
> 
> Output ioctls:
>   test VIDIOC_G/S_MODULATOR: OK (Not Supported)
>   test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
>   test VIDIOC_ENUMAUDOUT: OK (Not Supported)
>   test VIDIOC_G/S/ENUMOUTPUT: OK (Not Supported)
>   test VIDIOC_G/S_AUDOUT: OK (Not Supported)
>

Re: [PATCH 1/2] net: fec_main: Use platform_get_irq_byname_optional() to avoid error message

2019-10-12 Thread Jakub Kicinski

On Fri, 11 Oct 2019 12:55:20 +0300, Vladimir Oltean wrote:
> > > Unfortunately the networking subsystem sees around 100 patches
> > > submitted each day, it'd be very hard to keep track of patches which have
> > > external dependencies and when to merge them. That's why we need the
> > > submitters to do this work for us and resubmit when the patch can be
> > > applied cleanly.  
> >
> > OK, I will resend this patch series once the necessary patch lands
> > on the network tree.  
> 
> What has not been mentioned is that you can't create future
> dependencies for patches which have a Fixes: tag.
> 
> git describe --tags 7723f4c5ecdb # driver core: platform: Add an error
> message to platform_get_irq*()
> v5.3-rc1-13-g7723f4c5ecdb
> 
> git describe --tags f1da567f1dc # driver core: platform: Add
> platform_get_irq_byname_optional()
> v5.4-rc1-46-gf1da567f1dc1

Ack, you raise some good points. AFAIU, though, in this case the broken
patch, the dependency, and the fix are all targeting 5.4, so there
will be no real backporting hassle, while the presence of a Fixes
tag makes it clear where the regression was introduced.

Re: [PATCH 3/7] Add a UFFD_SECURE flag to the userfaultfd API.

2019-10-12 Thread Daniel Colascione

On Sat, Oct 12, 2019 at 4:10 PM Andy Lutomirski  wrote:
>
> On Sat, Oct 12, 2019 at 12:16 PM Daniel Colascione  wrote:
> >
> > The new secure flag makes userfaultfd use a new "secure" anonymous
> > file object instead of the default one, letting security modules
> > supervise userfaultfd use.
> >
> > Requiring that users pass a new flag lets us avoid changing the
> > semantics for existing callers.
>
> Is there any good reason not to make this be the default?
>
>
> The only downside I can see is that it would increase the memory usage
> of userfaultfd(), but that doesn't seem like such a big deal.  A
> lighter-weight alternative would be to have a single inode shared by
> all userfaultfd instances, which would require a somewhat different
> internal anon_inode API.

I'd also prefer to just make SELinux use mandatory, but there's a
nasty interaction with UFFD_EVENT_FORK. Adding a new UFFD_SECURE mode
which blocks UFFD_EVENT_FORK sidesteps this problem. Maybe you know a
better way to deal with it.

Right now, when a process with a UFFD-managed VMA using
UFFD_EVENT_FORK forks, we make a new userfaultfd_ctx out of thin air
and enqueue it on the message queue for the parent process. When we
dequeue that context, we get to resolve_userfault_fork, which makes up
a new UFFD file object out of thin air in the context of the reading
process. Following normal SELinux rules, the SID attached to that new
file object would be the task SID of the process *reading* the fork
event, not the SID of the new fork child. That seems wrong, because
the label we give to the UFFD should correspond to the label of the
process that UFFD controls.

To try to solve this problem, we can move the file object creation to
the fork child and enqueue the file object itself instead of just the
userfaultfd_ctx, treating the dequeue as a file-descriptor-receive
operation just like a recvmsg of an AF_UNIX socket with SCM_RIGHTS.
(This approach seems more elegant anyway, since it reflects what's
actually going on.) The trouble with the early-file-object-creation
approach is that the fork child may not be allowed to create UFFD file
objects on its own and an LSM can't tell the difference between
UFFD_EVENT_FORK handling creating the file object and the fork child
just calling userfaultfd(), meaning an LSM could veto the creation of
the file object for the fork event. We can't just create a
non-ANON_INODE_SECURE file object instead: that would defeat the whole
purpose of supervising UFFD using SELinux.

But maybe we can go further: let's separate authentication and
authorization, as we do in other LSM hooks. Let's split my
inode_init_security_anon into two hooks, inode_init_security_anon and
inode_create_anon. We'd define the former to just initialize the file
object's security information --- in the SELinux case, figuring out
its class and SID --- and define the latter to answer the yes/no
question of whether a particular anonymous inode creation should be
allowed. Normally, anon_inode_getfile2() would just call both hooks.
We'd add another anon_inode_getfd flag, ANON_INODE_SKIP_AUTHORIZATION
or something, that would tell anon_inode_getfile2() to skip calling
the authorization hook, effectively making the creation always
succeed. We can then make the UFFD code pass
ANON_INODE_SKIP_AUTHORIZATION when it's creating a file object in the
fork child while creating UFFD_EVENT_FORK messages.
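
To make the proposed split concrete, here is a minimal user-space model of it. The hook and flag names are taken from the paragraph above, prefixed with `model_` where they are functions; everything else (the struct, the integer SIDs, the toy policy, the flag's value) is a hypothetical illustration and not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag name from the proposal; the value is arbitrary in this model. */
#define ANON_INODE_SKIP_AUTHORIZATION 0x1u

struct anon_inode_model {
	int sid;	/* label assigned by the labeling hook */
	bool created;	/* whether creation was allowed to complete */
};

/* Hook 1 (labeling): always runs and only initializes security state. */
static void model_inode_init_security_anon(struct anon_inode_model *inode,
					    int sid)
{
	inode->sid = sid;
}

/* Hook 2 (authorization): answers the yes/no question. The toy policy
 * here permits creation only for SIDs below 100. */
static bool model_inode_create_anon(int sid)
{
	return sid < 100;
}

/* Normally calls both hooks; with ANON_INODE_SKIP_AUTHORIZATION the
 * authorization hook is skipped, so creation always succeeds.
 * Returns 0 on success, -1 if the authorization hook vetoes. */
static int model_anon_inode_create(struct anon_inode_model *inode, int sid,
				   unsigned int flags)
{
	assert(inode != NULL);
	inode->created = false;
	model_inode_init_security_anon(inode, sid);
	if (!(flags & ANON_INODE_SKIP_AUTHORIZATION) &&
	    !model_inode_create_anon(sid))
		return -1;
	inode->created = true;
	return 0;
}
```

In this model the object is labeled in both cases; the flag only removes the veto, which is the property the fork-event path would rely on.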

Granted, UFFD fork processing doesn't actually occur in the fork
child, but in copy_mm, in the parent --- but the right thing should
happen anyway, right?

I'm open to suggestions. In the meantime, I figured we'd just define a
UFFD_SECURE and make it incompatible with UFFD_EVENT_FORK.

> In any event, I don't think that "make me visible to SELinux" should
> be a choice that user code makes.

Right. The new unprivileged_userfaultfd setting is ugly, but it at
least removes the ability of unprivileged users to opt out of SELinux
supervision.

Re: [PATCH 4/7] Teach SELinux about a new userfaultfd class

2019-10-12 Thread Andy Lutomirski

On Sat, Oct 12, 2019 at 5:12 PM Daniel Colascione  wrote:
>
> On Sat, Oct 12, 2019 at 4:09 PM Andy Lutomirski  wrote:
> >
> > On Sat, Oct 12, 2019 at 12:16 PM Daniel Colascione wrote:
> > >
> > > Use the secure anonymous inode LSM hook we just added to let SELinux
> > > policy place restrictions on userfaultfd use. The create operation
> > > applies to processes creating new instances of these file objects;
> > > transfer between processes is covered by restrictions on read, write,
> > > and ioctl access already checked inside selinux_file_receive.
> >
> > This is great, and I suspect we'll want it for things like SGX, too.
> > But the current design seems like it will make it essentially
> > impossible for SELinux to reference an anon_inode class whose
> > file_operations are in a module, and moving file_operations out of a
> > module would be nasty.
> >
> > Could this instead be keyed off a new struct anon_inode_class, an
> > enum, or even just a string?
>
> The new LSM hook already receives the string that callers pass to the
> anon_inode APIs; modules can look at that instead of the fops if they
> want. The reason to pass both the name and the fops through the hook
> is to allow LSMs to match using fops comparison (which seems less
> prone to breakage) when possible and rely on string matching when it
> isn't.

I suppose that whoever makes the first module that wants to use this
mechanism can have the fun task of reworking it.  There's nothing
user-visible here that would make it hard to change in the future.

Re: [PATCH 1/7 v2] tracefs: Revert ccbd54ff54e8 ("tracefs: Restrict tracefs when the kernel is locked down")

2019-10-12 Thread Steven Rostedt

On Sat, 12 Oct 2019 20:35:02 -0400
Steven Rostedt  wrote:

> On Sat, 12 Oct 2019 15:56:15 -0700
> Linus Torvalds  wrote:
> 
> > On Fri, Oct 11, 2019 at 5:59 PM Steven Rostedt  wrote: 
> >  
> > >
> > >
> > > I bisected this down to the addition of the proxy_ops into tracefs for
> > > lockdown. It appears that the allocation of the proxy_ops and then freeing
> > > it in the destroy_inode callback, is causing havoc with the memory system.
> > > Reading the documentation about destroy_inode and talking with Linus about
> > > this, this is buggy and wrong.
> > 
> > Can you still add the explanation about the inode memory leak to this 
> > message?
> > 
> > Right now it just says "it's buggy and wrong". True. But doesn't
> > explain _why_ it is buggy and wrong.
> >   
> 
> Sure. The patches just finished my testing (along with other fixes that
> I need to send you). I have to make a few other updates in the change
> log though, so I'll be rebasing them (but not touching the code), to
> clean up the change logs.
> 

I updated this change log to state:

"I bisected this down to the addition of the proxy_ops into tracefs for
lockdown. It appears that the allocation of the proxy_ops and then freeing
it in the destroy_inode callback, is causing havoc with the memory system.
Reading the documentation about destroy_inode and talking with Linus about
this, this is buggy and wrong. When defining the destroy_inode() method, it 
is expected that the destroy_inode() will also free the inode, and not just 
the extra allocations done in the creation of the inode. The faulty commit 
causes a memory leak of the inode data structure when they are deleted."

-- Steve
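
The ownership rule stated in the updated change log can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions: the names (`inode_model`, `super_ops_model`, `model_destroy_inode`, and the two callbacks) are hypothetical, and the real VFS teardown path has more steps than shown here.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs;	/* outstanding allocations in this model */

static void *model_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void model_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

struct inode_model {
	void *private_data;	/* extra allocation made at creation time */
};

struct super_ops_model {
	/* If set, the callback owns the inode and must free it as well. */
	void (*destroy_inode)(struct inode_model *inode);
};

/* Generic teardown: frees the inode only when no callback is defined. */
static void model_destroy_inode(const struct super_ops_model *ops,
				struct inode_model *inode)
{
	assert(ops != NULL && inode != NULL);
	if (ops->destroy_inode) {
		ops->destroy_inode(inode);
		return;
	}
	model_free(inode);
}

/* Faulty pattern: frees only the extra allocation, so the inode leaks. */
static void destroy_private_only(struct inode_model *inode)
{
	model_free(inode->private_data);
}

/* Expected pattern: frees the extra allocation and the inode itself. */
static void destroy_private_and_inode(struct inode_model *inode)
{
	model_free(inode->private_data);
	model_free(inode);
}

static struct inode_model *model_new_inode(void)
{
	struct inode_model *inode = model_alloc(sizeof(*inode));

	if (inode)
		inode->private_data = model_alloc(16);
	return inode;
}
```

Tearing down an inode through `destroy_private_only` leaves one allocation outstanding per inode in this model, which corresponds to the leak the change log describes.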

[PATCH] xhci: Don't use soft retry if slot id > 0

2019-10-12 Thread Bernhard Gebetsberger

According to the xhci specification (chapter 4.6.8.1), soft retry
shouldn't be used if the slot id is higher than 0. Currently some usb
devices break on some systems because soft retry is being used when
there is a transaction error, without checking the slot id.

Fixes: f8f80be501aa ("xhci: Use soft retry to recover faster from transaction errors")

Signed-off-by: Bernhard Gebetsberger 
---
 drivers/usb/host/xhci-ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 85ceb43e3405..5fa06189068d 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2270,7 +2270,7 @@ static int process_bulk_intr_td(struct xhci_hcd *xhci, struct xhci_td *td,
break;
case COMP_USB_TRANSACTION_ERROR:
if ((ep_ring->err_count++ > MAX_SOFT_RETRY) ||
-   le32_to_cpu(slot_ctx->tt_info) & TT_SLOT)
+   le32_to_cpu(slot_ctx->tt_info) & TT_SLOT || slot_id > 0)
break;
*status = 0;
xhci_cleanup_halted_endpoint(xhci, slot_id, ep_index,
--
2.23.0
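
The condition changed by the hunk above can be exercised in isolation. This is a stand-alone sketch, not driver code: `model_try_soft_retry()` is a hypothetical helper, and the values of `MODEL_MAX_SOFT_RETRY` and `MODEL_TT_SLOT` are assumptions chosen for illustration rather than taken from the xhci headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_SOFT_RETRY 3u	/* assumed value, for illustration only */
#define MODEL_TT_SLOT 0xffu	/* assumed mask, for illustration only */

/* Mirrors the shape of the patched condition. Returns true when a soft
 * retry would be attempted. The error counter is post-incremented on
 * every call, as in the hunk, because it is the first operand of ||. */
static bool model_try_soft_retry(unsigned int *err_count, uint32_t tt_info,
				 unsigned int slot_id)
{
	assert(err_count != NULL);
	if (((*err_count)++ > MODEL_MAX_SOFT_RETRY) ||
	    (tt_info & MODEL_TT_SLOT) || slot_id > 0)
		return false;	/* corresponds to the "break" in the hunk */
	return true;
}
```

As written, the added clause makes the helper return false for every nonzero slot id, regardless of the error count or the TT bits.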

Re: [PATCH 1/7 v2] tracefs: Revert ccbd54ff54e8 ("tracefs: Restrict tracefs when the kernel is locked down")

2019-10-12 Thread Steven Rostedt

On Sat, 12 Oct 2019 15:56:15 -0700
Linus Torvalds  wrote:

> On Fri, Oct 11, 2019 at 5:59 PM Steven Rostedt  wrote:
> >
> >
> > I bisected this down to the addition of the proxy_ops into tracefs for
> > lockdown. It appears that the allocation of the proxy_ops and then freeing
> > it in the destroy_inode callback, is causing havoc with the memory system.
> > Reading the documentation about destroy_inode and talking with Linus about
> > this, this is buggy and wrong.  
> 
> Can you still add the explanation about the inode memory leak to this message?
> 
> Right now it just says "it's buggy and wrong". True. But doesn't
> explain _why_ it is buggy and wrong.
> 

Sure. The patches just finished my testing (along with other fixes that
I need to send you). I have to make a few other updates in the change
log though, so I'll be rebasing them (but not touching the code), to
clean up the change logs.

-- Steve

[PATCH v6 9/9] hugetlb_cgroup: Add hugetlb_cgroup reservation docs

2019-10-12 Thread Mina Almasry

Add docs for how to use hugetlb_cgroup reservations, and their behavior.

Signed-off-by: Mina Almasry 
Acked-by: Hillf Danton 

---

Changes in v6:
- Updated docs to reflect the new design based on a new counter that
tracks both reservations and faults.

---
 .../admin-guide/cgroup-v1/hugetlb.rst | 64 +++
 1 file changed, 53 insertions(+), 11 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v1/hugetlb.rst b/Documentation/admin-guide/cgroup-v1/hugetlb.rst
index a3902aa253a96..efb94e4db9d5a 100644
--- a/Documentation/admin-guide/cgroup-v1/hugetlb.rst
+++ b/Documentation/admin-guide/cgroup-v1/hugetlb.rst
@@ -2,13 +2,6 @@
 HugeTLB Controller
 ==================

-The HugeTLB controller allows to limit the HugeTLB usage per control group and
-enforces the controller limit during page fault. Since HugeTLB doesn't
-support page reclaim, enforcing the limit at page fault time implies that,
-the application will get SIGBUS signal if it tries to access HugeTLB pages
-beyond its limit. This requires the application to know beforehand how much
-HugeTLB pages it would require for its use.
-
 HugeTLB controller can be created by first mounting the cgroup filesystem.

 # mount -t cgroup -o hugetlb none /sys/fs/cgroup
@@ -28,10 +21,14 @@ process (bash) into it.

 Brief summary of control files::

- hugetlb.<hugepagesize>.limit_in_bytes     # set/show limit of "hugepagesize" hugetlb usage
- hugetlb.<hugepagesize>.max_usage_in_bytes # show max "hugepagesize" hugetlb  usage recorded
- hugetlb.<hugepagesize>.usage_in_bytes     # show current usage for "hugepagesize" hugetlb
- hugetlb.<hugepagesize>.failcnt            # show the number of allocation failure due to HugeTLB limit
+ hugetlb.<hugepagesize>.reservation_limit_in_bytes     # set/show limit of "hugepagesize" hugetlb reservations
+ hugetlb.<hugepagesize>.reservation_max_usage_in_bytes # show max "hugepagesize" hugetlb reservations and no-reserve faults.
+ hugetlb.<hugepagesize>.reservation_usage_in_bytes     # show current reservations and no-reserve faults for "hugepagesize" hugetlb
+ hugetlb.<hugepagesize>.reservation_failcnt            # show the number of allocation failure due to HugeTLB reservation limit
+ hugetlb.<hugepagesize>.limit_in_bytes                 # set/show limit of "hugepagesize" hugetlb faults
+ hugetlb.<hugepagesize>.max_usage_in_bytes             # show max "hugepagesize" hugetlb  usage recorded
+ hugetlb.<hugepagesize>.usage_in_bytes                 # show current usage for "hugepagesize" hugetlb
+ hugetlb.<hugepagesize>.failcnt                        # show the number of allocation failure due to HugeTLB usage limit

 For a system supporting three hugepage sizes (64k, 32M and 1G), the control
 files include::
@@ -40,11 +37,56 @@ files include::
   hugetlb.1GB.max_usage_in_bytes
   hugetlb.1GB.usage_in_bytes
   hugetlb.1GB.failcnt
+  hugetlb.1GB.reservation_limit_in_bytes
+  hugetlb.1GB.reservation_max_usage_in_bytes
+  hugetlb.1GB.reservation_usage_in_bytes
+  hugetlb.1GB.reservation_failcnt
   hugetlb.64KB.limit_in_bytes
   hugetlb.64KB.max_usage_in_bytes
   hugetlb.64KB.usage_in_bytes
   hugetlb.64KB.failcnt
+  hugetlb.64KB.reservation_limit_in_bytes
+  hugetlb.64KB.reservation_max_usage_in_bytes
+  hugetlb.64KB.reservation_usage_in_bytes
+  hugetlb.64KB.reservation_failcnt
   hugetlb.32MB.limit_in_bytes
   hugetlb.32MB.max_usage_in_bytes
   hugetlb.32MB.usage_in_bytes
   hugetlb.32MB.failcnt
+  hugetlb.32MB.reservation_limit_in_bytes
+  hugetlb.32MB.reservation_max_usage_in_bytes
+  hugetlb.32MB.reservation_usage_in_bytes
+  hugetlb.32MB.reservation_failcnt
+
+
+1. Reservation limits
+
+The HugeTLB controller allows limiting the HugeTLB reservations per control
+group and enforces the controller limit at reservation time and at the fault of
+hugetlb memory for which no reservation exists. Reservation limits
+are superior to page fault limits (see section 2), since reservation limits are
+enforced at reservation time (on mmap or shmget), and never cause the
+application to get a SIGBUS signal if the memory was reserved beforehand. For
+MAP_NORESERVE allocations, the reservation limit behaves the same as the fault
+limit, enforcing memory usage at fault time and causing the application to
+receive a SIGBUS if it crosses its limit.
+
+2. Page fault limits
+
+The HugeTLB controller allows to limit the HugeTLB usage (page fault) per
+control group and enforces the controller limit during page fault. Since HugeTLB
+doesn't support page reclaim, enforcing the limit at page fault time implies
+that, the application will get SIGBUS signal if it tries to access HugeTLB
+pages beyond its limit. This requires the application to know beforehand how
+much HugeTLB pages it would require for its use.
+
+
+3. Caveats with shared memory
+
+For shared hugetlb memory, both hugetlb reservation and page faults are charged
+to the first task that causes the memory to be reserved or faulted, and all
+subsequent uses of this reserved or faulted memory are done without charging.
+
+Shared hugetlb memory is only uncharged when it is unreserved or deallocated.
+This is usually

[PATCH] net: core: skbuff: skb_checksum_setup() drop err

2019-10-12 Thread Vito Caputo

Return directly from all switch cases, no point in storing in err.

Signed-off-by: Vito Caputo 
---
 net/core/skbuff.c | 15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f5f904f46893..c59b68a413b5 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4888,23 +4888,14 @@ static int skb_checksum_setup_ipv6(struct sk_buff *skb, bool recalculate)
  */
 int skb_checksum_setup(struct sk_buff *skb, bool recalculate)
 {
-   int err;
-
switch (skb->protocol) {
case htons(ETH_P_IP):
-   err = skb_checksum_setup_ipv4(skb, recalculate);
-   break;
-
+   return skb_checksum_setup_ipv4(skb, recalculate);
case htons(ETH_P_IPV6):
-   err = skb_checksum_setup_ipv6(skb, recalculate);
-   break;
-
+   return skb_checksum_setup_ipv6(skb, recalculate);
default:
-   err = -EPROTO;
-   break;
+   return -EPROTO;
}
-
-   return err;
 }
 EXPORT_SYMBOL(skb_checksum_setup);
 
-- 
2.11.0

[PATCH v6 8/9] hugetlb_cgroup: Add hugetlb_cgroup reservation tests

2019-10-12 Thread Mina Almasry

The tests use both shared and private mapped hugetlb memory, and
monitor the hugetlb usage counter as well as the hugetlb reservation
counter. They test different configurations such as hugetlb memory usage
via hugetlbfs, or MAP_HUGETLB, or shmget/shmat, and with and without
MAP_POPULATE.

Signed-off-by: Mina Almasry 

---

Changes in v6:
- Updates tests for cgroups-v2 and NORESERVE allocations.

---
 tools/testing/selftests/vm/.gitignore |   1 +
 tools/testing/selftests/vm/Makefile   |   1 +
 .../selftests/vm/charge_reserved_hugetlb.sh   | 527 ++
 .../selftests/vm/write_hugetlb_memory.sh  |  23 +
 .../testing/selftests/vm/write_to_hugetlbfs.c | 261 +
 5 files changed, 813 insertions(+)
 create mode 100755 tools/testing/selftests/vm/charge_reserved_hugetlb.sh
 create mode 100644 tools/testing/selftests/vm/write_hugetlb_memory.sh
 create mode 100644 tools/testing/selftests/vm/write_to_hugetlbfs.c

diff --git a/tools/testing/selftests/vm/.gitignore b/tools/testing/selftests/vm/.gitignore
index 31b3c98b6d34d..d3bed9407773c 100644
--- a/tools/testing/selftests/vm/.gitignore
+++ b/tools/testing/selftests/vm/.gitignore
@@ -14,3 +14,4 @@ virtual_address_range
 gup_benchmark
 va_128TBswitch
 map_fixed_noreplace
+write_to_hugetlbfs
diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile
index 9534dc2bc9295..31c2cc5cf30b5 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -18,6 +18,7 @@ TEST_GEN_FILES += transhuge-stress
 TEST_GEN_FILES += userfaultfd
 TEST_GEN_FILES += va_128TBswitch
 TEST_GEN_FILES += virtual_address_range
+TEST_GEN_FILES += write_to_hugetlbfs

 TEST_PROGS := run_vmtests

diff --git a/tools/testing/selftests/vm/charge_reserved_hugetlb.sh b/tools/testing/selftests/vm/charge_reserved_hugetlb.sh
new file mode 100755
index 0..278dd6475cd0f
--- /dev/null
+++ b/tools/testing/selftests/vm/charge_reserved_hugetlb.sh
@@ -0,0 +1,527 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+set -e
+
+if [[ $(id -u) -ne 0 ]]; then
+   echo "This test must be run as root. Skipping..."
+   exit 0
+fi
+
+cgroup_path=/dev/cgroup/memory
+if [[ ! -e $cgroup_path ]]; then
+  mkdir -p $cgroup_path
+  mount -t cgroup2 none $cgroup_path
+fi
+
+echo "+hugetlb" > /dev/cgroup/memory/cgroup.subtree_control
+
+
+cleanup () {
+   echo $$ > $cgroup_path/cgroup.procs
+
+   if [[ -e /mnt/huge ]]; then
+ rm -rf /mnt/huge/*
+ umount /mnt/huge || echo error
+ rmdir /mnt/huge
+   fi
+   if [[ -e $cgroup_path/hugetlb_cgroup_test ]]; then
+ rmdir $cgroup_path/hugetlb_cgroup_test
+   fi
+   if [[ -e $cgroup_path/hugetlb_cgroup_test1 ]]; then
+ rmdir $cgroup_path/hugetlb_cgroup_test1
+   fi
+   if [[ -e $cgroup_path/hugetlb_cgroup_test2 ]]; then
+ rmdir $cgroup_path/hugetlb_cgroup_test2
+   fi
+   echo 0 > /proc/sys/vm/nr_hugepages
+   echo CLEANUP DONE
+}
+
+function expect_equal() {
+  local expected="$1"
+  local actual="$2"
+  local error="$3"
+
+  if [[ "$expected" != "$actual" ]]; then
+   echo "expected ($expected) != actual ($actual): $3"
+   cleanup
+   exit 1
+  fi
+}
+
+function setup_cgroup() {
+  local name="$1"
+  local cgroup_limit="$2"
+  local reservation_limit="$3"
+
+  mkdir $cgroup_path/$name
+
+  echo writing cgroup limit: "$cgroup_limit"
+  echo "$cgroup_limit" > $cgroup_path/$name/hugetlb.2MB.limit_in_bytes
+
+  echo writing reservation limit: "$reservation_limit"
+  echo "$reservation_limit" > \
+   $cgroup_path/$name/hugetlb.2MB.reservation_limit_in_bytes
+
+  if [ -e "$cgroup_path/$name/cpuset.cpus" ]; then
+echo 0 > $cgroup_path/$name/cpuset.cpus
+  fi
+  if [ -e "$cgroup_path/$name/cpuset.mems" ]; then
+echo 0 > $cgroup_path/$name/cpuset.mems
+  fi
+}
+
+function wait_for_hugetlb_memory_to_get_depleted {
+   local cgroup="$1"
+   local path="/dev/cgroup/memory/$cgroup/hugetlb.2MB.reservation_usage_in_bytes"
+   # Wait for hugetlbfs memory to get depleted.
+   while [ $(cat $path) != 0 ]; do
+  echo Waiting for hugetlb memory to get depleted.
+  cat $path
+  sleep 0.5
+   done
+}
+
+function wait_for_hugetlb_memory_to_get_reserved {
+   local cgroup="$1"
+   local size="$2"
+
+   local path="/dev/cgroup/memory/$cgroup/hugetlb.2MB.reservation_usage_in_bytes"
+   # Wait for hugetlbfs memory to get written.
+   while [ $(cat $path) != $size ]; do
+  echo Waiting for hugetlb memory to reach size $size.
+  cat $path
+  sleep 0.5
+   done
+}
+
+function wait_for_hugetlb_memory_to_get_written {
+   local cgroup="$1"
+   local size="$2"
+
+   local path="/dev/cgroup/memory/$cgroup/hugetlb.2MB.usage_in_bytes"
+   # Wait for hugetlbfs memory to get written.
+   while [ $(cat $path) != $size ]; do
+  echo Waiting for hugetlb

[PATCH v6 6/9] hugetlb_cgroup: add accounting for shared mappings

2019-10-12 Thread Mina Almasry

For shared mappings, the pointer to the hugetlb_cgroup to uncharge lives
in the resv_map entries, in file_region->reservation_counter.

After a call to region_chg, we charge the appropriate hugetlb_cgroup, and if
successful, we pass on the hugetlb_cgroup info to a follow up region_add call.
When a file_region entry is added to the resv_map via region_add, we put the
pointer to that cgroup in file_region->reservation_counter. If charging doesn't
succeed, we report the error to the caller, so that the kernel fails the
reservation.

On region_del, which is when the hugetlb memory is unreserved, we also uncharge
the file_region->reservation_counter.
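
The record-then-uncharge flow described above can be sketched in user-space C.
This is a toy model under stated assumptions, not the kernel code: struct
toy_counter, struct toy_region, toy_record() and toy_region_del() are
hypothetical names, and the amount uncharged on deletion (range length times
pages_per_hpage) is an assumption, since the region_del hunk is not shown in
this excerpt.

```c
#include <stddef.h>

/* Toy stand-in for struct page_counter (assumption, not the kernel type). */
struct toy_counter {
	unsigned long usage;
};

/* Toy stand-in for struct file_region with the two new fields. */
struct toy_region {
	long from, to;				/* range, in huge pages */
	struct toy_counter *reservation_counter;
	unsigned long pages_per_hpage;
};

/* Like record_hugetlb_cgroup_uncharge_info(): remember whom to uncharge. */
static void toy_record(struct toy_region *rg, struct toy_counter *c,
		       unsigned long pages_per_hpage)
{
	rg->reservation_counter = c;
	rg->pages_per_hpage = c ? pages_per_hpage : 0;
}

/* Like the region_del step: uncharge whatever was recorded for the region. */
static void toy_region_del(struct toy_region *rg)
{
	if (rg->reservation_counter)
		rg->reservation_counter->usage -=
			(unsigned long)(rg->to - rg->from) * rg->pages_per_hpage;
	rg->reservation_counter = NULL;
}
```

Keeping the counter pointer in each region entry means the uncharge does not
depend on which task later faults or unmaps the memory.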

Signed-off-by: Mina Almasry 

---
 mm/hugetlb.c | 147 ---
 1 file changed, 116 insertions(+), 31 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f9c1947925bb9..af336bf227fb6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -242,6 +242,15 @@ struct file_region {
struct list_head link;
long from;
long to;
+#ifdef CONFIG_CGROUP_HUGETLB
+   /*
+* On shared mappings, each reserved region appears as a struct
+* file_region in resv_map. These fields hold the info needed to
+* uncharge each reservation.
+*/
+   struct page_counter *reservation_counter;
+   unsigned long pages_per_hpage;
+#endif
 };

 /* Helper that removes a struct file_region from the resv_map cache and returns
@@ -250,12 +259,30 @@ struct file_region {
 static struct file_region *
 get_file_region_entry_from_cache(struct resv_map *resv, long from, long to);

+/* Helper that records hugetlb_cgroup uncharge info. */
+static void record_hugetlb_cgroup_uncharge_info(struct hugetlb_cgroup *h_cg,
+   struct file_region *nrg,
+   struct hstate *h)
+{
+#ifdef CONFIG_CGROUP_HUGETLB
+   if (h_cg) {
+   nrg->reservation_counter =
+   &h_cg->reserved_hugepage[hstate_index(h)];
+   nrg->pages_per_hpage = pages_per_huge_page(h);
+   } else {
+   nrg->reservation_counter = NULL;
+   nrg->pages_per_hpage = 0;
+   }
+#endif
+}
+
 /* Must be called with resv->lock held. Calling this with count_only == true
  * will count the number of pages to be added but will not modify the linked
  * list.
  */
 static long add_reservation_in_range(struct resv_map *resv, long f, long t,
-bool count_only)
+struct hugetlb_cgroup *h_cg,
+struct hstate *h, bool count_only)
 {
long add = 0;
struct list_head *head = &resv->regions;
@@ -291,6 +318,8 @@ static long add_reservation_in_range(struct resv_map *resv, long f, long t,
if (!count_only) {
nrg = get_file_region_entry_from_cache(
resv, last_accounted_offset, rg->from);
+   record_hugetlb_cgroup_uncharge_info(h_cg, nrg,
+   h);
list_add(&nrg->link, rg->link.prev);
}
}
@@ -306,11 +335,13 @@ static long add_reservation_in_range(struct resv_map *resv, long f, long t,
if (!count_only) {
nrg = get_file_region_entry_from_cache(
resv, last_accounted_offset, t);
+   record_hugetlb_cgroup_uncharge_info(h_cg, nrg, h);
list_add(&nrg->link, rg->link.prev);
}
last_accounted_offset = t;
}

+   VM_BUG_ON(add < 0);
return add;
 }

@@ -327,7 +358,8 @@ static long add_reservation_in_range(struct resv_map *resv, long f, long t,
  * Return the number of new huge pages added to the map.  This
  * number is greater than or equal to zero.
  */
-static long region_add(struct resv_map *resv, long f, long t,
+static long region_add(struct hstate *h, struct hugetlb_cgroup *h_cg,
+  struct resv_map *resv, long f, long t,
   long regions_needed)
 {
long add = 0;
@@ -336,7 +368,7 @@ static long region_add(struct resv_map *resv, long f, long t,

VM_BUG_ON(resv->region_cache_count < regions_needed);

-   add = add_reservation_in_range(resv, f, t, false);
+   add = add_reservation_in_range(resv, f, t, h_cg, h, false);
resv->adds_in_progress -= regions_needed;

spin_unlock(&resv->lock);
@@ -398,7 +430,7 @@ static long region_chg(struct resv_map *resv, long f, long t,
}

/* Count how many hugepages in this range are NOT respresented. */
-   chg = add_reservation_in_range(resv, f, t, true);
+   chg = add_reservation_in_range(resv, f, t, NULL, NULL, true);

spin_unlock(&resv->lock);
return chg;
@@

[PATCH v6 7/9] hugetlb_cgroup: support noreserve mappings

2019-10-12 Thread Mina Almasry

Support MAP_NORESERVE accounting as part of the new counter.

For each hugepage allocation, at allocation time we check if there is
a reservation for this allocation or not. If there is a reservation for
this allocation, then this allocation was charged at reservation time,
and we don't re-account it. If there is no reservation for this
allocation, we charge the appropriate hugetlb_cgroup.

The hugetlb_cgroup to uncharge for this allocation is stored in
page[3].private. We use new APIs added in an earlier patch to set this
pointer.
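
The rule for when a fault is charged to the reservation counter can be written
as a single predicate. The sketch below is a user-space illustration only:
toy_needs_reservation_charge() is a hypothetical helper, and its three
parameters stand in for map_chg, avoid_reserve and vma_resv_map(vma) as they
appear in the alloc_huge_page() hunk of this patch.

```c
#include <stdbool.h>

/*
 * Returns true when the allocation does not consume an existing
 * reservation and must therefore be charged to the reservation
 * counter at fault time.
 *
 * map_chg:       non-zero when no reservation covers this page
 * avoid_reserve: the caller asked not to use reserved pages
 * has_resv_map:  whether the VMA has a reservation map at all
 */
static bool toy_needs_reservation_charge(long map_chg, bool avoid_reserve,
					 bool has_resv_map)
{
	return map_chg || avoid_reserve || !has_resv_map;
}
```

The same condition guards the charge, the commit and the error-path uncharge in
the diff, which keeps the three steps symmetric.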

---
 mm/hugetlb.c | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index af336bf227fb6..79b99878ce6f9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1217,6 +1217,7 @@ static void update_and_free_page(struct hstate *h, struct page *page)
1 << PG_writeback);
}
VM_BUG_ON_PAGE(hugetlb_cgroup_from_page(page, false), page);
+   VM_BUG_ON_PAGE(hugetlb_cgroup_from_page(page, true), page);
set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
set_page_refcounted(page);
if (hstate_is_gigantic(h)) {
@@ -1328,6 +1329,9 @@ void free_huge_page(struct page *page)
clear_page_huge_active(page);
hugetlb_cgroup_uncharge_page(hstate_index(h), pages_per_huge_page(h),
 page, false);
+   hugetlb_cgroup_uncharge_page(hstate_index(h), pages_per_huge_page(h),
+page, true);
+
if (restore_reserve)
h->resv_huge_pages++;

@@ -1354,6 +1358,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
spin_lock(&hugetlb_lock);
set_hugetlb_cgroup(page, NULL, false);
+   set_hugetlb_cgroup(page, NULL, true);
h->nr_huge_pages++;
h->nr_huge_pages_node[nid]++;
spin_unlock(&hugetlb_lock);
@@ -2155,10 +2160,19 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
gbl_chg = 1;
}

+   /* If this allocation is not consuming a reservation, charge it now.
+*/
+   if (map_chg || avoid_reserve || !vma_resv_map(vma)) {
+   ret = hugetlb_cgroup_charge_cgroup(idx, pages_per_huge_page(h),
+  &h_cg, true);
+   if (ret)
+   goto out_subpool_put;
+   }
+
ret = hugetlb_cgroup_charge_cgroup(idx, pages_per_huge_page(h), &h_cg,
   false);
if (ret)
-   goto out_subpool_put;
+   goto out_uncharge_cgroup_reservation;

spin_lock(&hugetlb_lock);
/*
@@ -2182,6 +2196,11 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
}
hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, page,
 false);
+   if (!vma_resv_map(vma) || map_chg || avoid_reserve) {
+   hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg,
+page, true);
+   }
+
spin_unlock(&hugetlb_lock);

set_page_private(page, (unsigned long)spool);
@@ -2207,6 +2226,10 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
 out_uncharge_cgroup:
hugetlb_cgroup_uncharge_cgroup(idx, pages_per_huge_page(h), h_cg,
   false);
+out_uncharge_cgroup_reservation:
+   if (map_chg || avoid_reserve || !vma_resv_map(vma))
+   hugetlb_cgroup_uncharge_cgroup(idx, pages_per_huge_page(h),
+  h_cg, true);
 out_subpool_put:
if (map_chg || avoid_reserve)
hugepage_subpool_put_pages(spool, 1);
--
2.23.0.700.g56cf767bdb-goog

[PATCH v6 4/9] hugetlb_cgroup: add reservation accounting for private mappings

2019-10-12 Thread Mina Almasry

Normally the pointer to the cgroup to uncharge hangs off the struct
page, and gets queried when it's time to free the page. With
hugetlb_cgroup reservations, this is not possible, because a page can
be reserved by one task and actually faulted in by another task.

The best place to put the hugetlb_cgroup pointer to uncharge for
reservations is in the resv_map. But, because the resv_map has different
semantics for private and shared mappings, the code path to
charge/uncharge shared and private mappings is different. This patch
implements charging and uncharging for private mappings.

For private mappings, the counter to uncharge is in
resv_map->reservation_counter. On initializing the resv_map this is set
to NULL. On reservation of a region in a private mapping, the task's
hugetlb_cgroup is charged and the hugetlb_cgroup is placed in
resv_map->reservation_counter.

On hugetlb_vm_op_close, we uncharge resv_map->reservation_counter.
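
The lifecycle for private mappings (charge at reservation, remember the counter
in the resv_map, uncharge on close) can be modelled in user-space C. Everything
below is a hypothetical sketch: struct toy_counter, struct toy_resv_map,
toy_reserve_private() and toy_vm_close() are illustrative names, and the limit
check is a simplification of what struct page_counter does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page_counter (assumption). */
struct toy_counter {
	unsigned long usage, limit;
};

/* Toy stand-in for struct resv_map with the two new fields. */
struct toy_resv_map {
	struct toy_counter *reservation_counter; /* NULL until charged */
	unsigned long pages_per_hpage;
};

/* Reservation time: charge the task's counter and remember it in the map. */
static bool toy_reserve_private(struct toy_resv_map *map,
				struct toy_counter *c, unsigned long hpages,
				unsigned long pages_per_hpage)
{
	unsigned long nr = hpages * pages_per_hpage;

	if (c->usage + nr > c->limit)
		return false;	/* the reservation fails, nothing is recorded */
	c->usage += nr;
	map->reservation_counter = c;
	map->pages_per_hpage = pages_per_hpage;
	return true;
}

/* Close time: uncharge the counter recorded in the map, if any. */
static void toy_vm_close(struct toy_resv_map *map, unsigned long hpages)
{
	if (!map->reservation_counter)
		return;
	map->reservation_counter->usage -= hpages * map->pages_per_hpage;
}
```

Because the counter lives in the map rather than in struct page, the uncharge
works even if a different task faulted the pages in.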

Signed-off-by: Mina Almasry 
Acked-by: Hillf Danton 

---
 include/linux/hugetlb.h|  8 +++
 include/linux/hugetlb_cgroup.h | 11 +
 mm/hugetlb.c   | 44 +-
 mm/hugetlb_cgroup.c| 12 --
 4 files changed, 62 insertions(+), 13 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 9c49a0ba894d3..36dcda7be4b0e 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -46,6 +46,14 @@ struct resv_map {
long adds_in_progress;
struct list_head region_cache;
long region_cache_count;
+#ifdef CONFIG_CGROUP_HUGETLB
+   /*
+* On private mappings, the counter to uncharge reservations is stored
+* here. If these fields are 0, then the mapping is shared.
+*/
+   struct page_counter *reservation_counter;
+   unsigned long pages_per_hpage;
+#endif
 };
 extern struct resv_map *resv_map_alloc(void);
 void resv_map_release(struct kref *ref);
diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
index 1bb58a63af586..f6e3d74a02536 100644
--- a/include/linux/hugetlb_cgroup.h
+++ b/include/linux/hugetlb_cgroup.h
@@ -25,6 +25,17 @@ struct hugetlb_cgroup;
 #define HUGETLB_CGROUP_MIN_ORDER 3

 #ifdef CONFIG_CGROUP_HUGETLB
+struct hugetlb_cgroup {
+   struct cgroup_subsys_state css;
+   /*
+* the counter to account for hugepages from hugetlb.
+*/
+   struct page_counter hugepage[HUGE_MAX_HSTATE];
+   /*
+* the counter to account for hugepage reservations from hugetlb.
+*/
+   struct page_counter reserved_hugepage[HUGE_MAX_HSTATE];
+};

 static inline struct hugetlb_cgroup *hugetlb_cgroup_from_page(struct page *page,
  bool reserved)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 324859170463b..4a60d7d44b4c3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -665,6 +665,16 @@ struct resv_map *resv_map_alloc(void)
INIT_LIST_HEAD(&resv_map->regions);

resv_map->adds_in_progress = 0;
+#ifdef CONFIG_CGROUP_HUGETLB
+   /*
+* Initialize these to 0. On shared mappings, 0's here indicate these
+* fields don't do cgroup accounting. On private mappings, these will be
+* re-initialized to the proper values, to indicate that hugetlb cgroup
+* reservations are to be un-charged from here.
+*/
+   resv_map->reservation_counter = NULL;
+   resv_map->pages_per_hpage = 0;
+#endif

INIT_LIST_HEAD(&resv_map->region_cache);
list_add(&rg->link, &resv_map->region_cache);
@@ -3217,7 +3227,18 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)

reserve = (end - start) - region_count(resv, start, end);

-   kref_put(&resv->refs, resv_map_release);
+#ifdef CONFIG_CGROUP_HUGETLB
+   /*
+* Since we check for HPAGE_RESV_OWNER above, this must be a private
+* mapping, and these values should be non-zero, and should point to
+* the hugetlb_cgroup counter to uncharge for this reservation.
+*/
+   WARN_ON(!resv->reservation_counter);
+   WARN_ON(!resv->pages_per_hpage);
+
+   hugetlb_cgroup_uncharge_counter(resv->reservation_counter,
+   (end - start) * resv->pages_per_hpage);
+#endif

if (reserve) {
/*
@@ -3227,6 +3248,8 @@ static void hugetlb_vm_op_close(struct vm_area_struct *vma)
gbl_reserve = hugepage_subpool_put_pages(spool, reserve);
hugetlb_acct_memory(h, -gbl_reserve);
}
+
+   kref_put(&resv->refs, resv_map_release);
 }

 static int hugetlb_vm_op_split(struct vm_area_struct *vma, unsigned long addr)
@@ -4560,6 +4583,7 @@ int hugetlb_reserve_pages(struct inode *inode,
struct hstate *h = hstate_inode(inode);
struct hugepage_subpool *spool = subpool_inode(inode);
struct resv_map *resv_map;
+   struct hugetlb_cgroup *h_cg;
long gbl_reserve;

[PATCH net-next v2] hv_sock: use HV_HYP_PAGE_SIZE for Hyper-V communication

2019-10-12 Thread Michael Kelley

From: Himadri Pandya 

Current code assumes PAGE_SIZE (the guest page size) is equal
to the page size used to communicate with Hyper-V (which is
always 4K). While this assumption is true on x86, it may not
be true for Hyper-V on other architectures. For example,
Linux on ARM64 may have PAGE_SIZE of 16K or 64K. A new symbol,
HV_HYP_PAGE_SIZE, has been previously introduced to use when
the Hyper-V page size is intended instead of the guest page size.

Make this code work on non-x86 architectures by using the new
HV_HYP_PAGE_SIZE symbol instead of PAGE_SIZE, where appropriate.
Also replace the now redundant PAGE_SIZE_4K with HV_HYP_PAGE_SIZE.
The change has no effect on x86, but lays the groundwork to run
on ARM64 and others.
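
To illustrate why the alignment unit matters, here is a small user-space sketch
of the ring buffer sizing. TOY_HV_HYP_PAGE_SIZE, TOY_ALIGN() and
toy_ring_size() are hypothetical names that mimic the clamping and alignment
done in hvs_open_connection(); they are not the kernel macros.

```c
/* Hyper-V communicates in 4K pages regardless of the guest page size. */
#define TOY_HV_HYP_PAGE_SIZE	4096UL

/* Round x up to a multiple of a; a must be a power of two. */
#define TOY_ALIGN(x, a)		(((x) + (a) - 1) & ~((a) - 1))

/*
 * Clamp the requested size between six and 64 Hyper-V pages, then
 * align it to align_to (either the Hyper-V or the guest page size).
 */
static unsigned long toy_ring_size(unsigned long requested,
				   unsigned long align_to)
{
	unsigned long min = TOY_HV_HYP_PAGE_SIZE * 6;
	unsigned long max = TOY_HV_HYP_PAGE_SIZE * 64;
	unsigned long size = requested < min ? min : requested;

	if (size > max)
		size = max;
	return TOY_ALIGN(size, align_to);
}
```

With a 4K alignment the default stays at six 4K pages (24576 bytes); aligning
the same value to a 64K guest page would round it up to 65536 bytes.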

Signed-off-by: Himadri Pandya 
Reviewed-by: Michael Kelley 
---

Changes in v2:
* Revised commit message and subject [Jakub Kicinski]

---
 net/vmw_vsock/hyperv_transport.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index 261521d..d2929ea 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -13,15 +13,16 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Older (VMBUS version 'VERSION_WIN10' or before) Windows hosts have some
- * stricter requirements on the hv_sock ring buffer size of six 4K pages. Newer
- * hosts don't have this limitation; but, keep the defaults the same for compat.
+ * stricter requirements on the hv_sock ring buffer size of six 4K pages.
+ * hyperv-tlfs defines HV_HYP_PAGE_SIZE as 4K. Newer hosts don't have this
+ * limitation; but, keep the defaults the same for compat.
  */
-#define PAGE_SIZE_4K   4096
-#define RINGBUFFER_HVS_RCV_SIZE (PAGE_SIZE_4K * 6)
-#define RINGBUFFER_HVS_SND_SIZE (PAGE_SIZE_4K * 6)
-#define RINGBUFFER_HVS_MAX_SIZE (PAGE_SIZE_4K * 64)
+#define RINGBUFFER_HVS_RCV_SIZE (HV_HYP_PAGE_SIZE * 6)
+#define RINGBUFFER_HVS_SND_SIZE (HV_HYP_PAGE_SIZE * 6)
+#define RINGBUFFER_HVS_MAX_SIZE (HV_HYP_PAGE_SIZE * 64)
 
 /* The MTU is 16KB per the host side's design */
 #define HVS_MTU_SIZE   (1024 * 16)
@@ -54,7 +55,8 @@ struct hvs_recv_buf {
  * ringbuffer APIs that allow us to directly copy data from userspace buffer
  * to VMBus ringbuffer.
  */
-#define HVS_SEND_BUF_SIZE (PAGE_SIZE_4K - sizeof(struct vmpipe_proto_header))
+#define HVS_SEND_BUF_SIZE \
+   (HV_HYP_PAGE_SIZE - sizeof(struct vmpipe_proto_header))
 
 struct hvs_send_buf {
/* The header before the payload data */
@@ -393,10 +395,10 @@ static void hvs_open_connection(struct vmbus_channel *chan)
} else {
sndbuf = max_t(int, sk->sk_sndbuf, RINGBUFFER_HVS_SND_SIZE);
sndbuf = min_t(int, sndbuf, RINGBUFFER_HVS_MAX_SIZE);
-   sndbuf = ALIGN(sndbuf, PAGE_SIZE);
+   sndbuf = ALIGN(sndbuf, HV_HYP_PAGE_SIZE);
rcvbuf = max_t(int, sk->sk_rcvbuf, RINGBUFFER_HVS_RCV_SIZE);
rcvbuf = min_t(int, rcvbuf, RINGBUFFER_HVS_MAX_SIZE);
-   rcvbuf = ALIGN(rcvbuf, PAGE_SIZE);
+   rcvbuf = ALIGN(rcvbuf, HV_HYP_PAGE_SIZE);
}
 
ret = vmbus_open(chan, sndbuf, rcvbuf, NULL, 0, hvs_channel_cb,
@@ -670,7 +672,7 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
ssize_t ret = 0;
ssize_t bytes_written = 0;
 
-   BUILD_BUG_ON(sizeof(*send_buf) != PAGE_SIZE_4K);
+   BUILD_BUG_ON(sizeof(*send_buf) != HV_HYP_PAGE_SIZE);
 
send_buf = kmalloc(sizeof(*send_buf), GFP_KERNEL);
if (!send_buf)
-- 
1.8.3.1

[PATCH v6 2/9] hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations

2019-10-12 Thread Mina Almasry

Augments hugetlb_cgroup_charge_cgroup to be able to charge hugetlb
usage or hugetlb reservation counter.

Adds a new interface to uncharge a hugetlb_cgroup counter via
hugetlb_cgroup_uncharge_counter.

Integrates the counter with hugetlb_cgroup, via hugetlb_cgroup_init,
hugetlb_cgroup_have_usage, and hugetlb_cgroup_css_offline.
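
The new boolean argument selects which tail-page slot holds the cgroup pointer.
A user-space sketch of that selection follows; struct toy_page and the toy_*
helpers are hypothetical stand-ins for struct page, set_hugetlb_cgroup() and
hugetlb_cgroup_from_page(), and the compound-order check is left out.

```c
#include <stdbool.h>

/* Toy stand-in for struct page; only the private word matters here. */
struct toy_page {
	unsigned long private;
};

/* page[3] holds the reservation cgroup, page[2] the usage cgroup. */
static unsigned long *toy_cgroup_slot(struct toy_page *page, bool reserved)
{
	return &page[reserved ? 3 : 2].private;
}

static void toy_set_cgroup(struct toy_page *page, void *h_cg, bool reserved)
{
	*toy_cgroup_slot(page, reserved) = (unsigned long)h_cg;
}

static void *toy_cgroup_from_page(struct toy_page *page, bool reserved)
{
	return (void *)*toy_cgroup_slot(page, reserved);
}
```

Using separate slots lets one huge page be charged to the usage counter and to
the reservation counter independently.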

Signed-off-by: Mina Almasry 

---
 include/linux/hugetlb_cgroup.h |  67 +-
 mm/hugetlb.c   |  17 +++---
 mm/hugetlb_cgroup.c| 100 +
 3 files changed, 130 insertions(+), 54 deletions(-)

diff --git a/include/linux/hugetlb_cgroup.h b/include/linux/hugetlb_cgroup.h
index 063962f6dfc6a..1bb58a63af586 100644
--- a/include/linux/hugetlb_cgroup.h
+++ b/include/linux/hugetlb_cgroup.h
@@ -22,27 +22,35 @@ struct hugetlb_cgroup;
  * Minimum page order trackable by hugetlb cgroup.
  * At least 3 pages are necessary for all the tracking information.
  */
-#define HUGETLB_CGROUP_MIN_ORDER   2
+#define HUGETLB_CGROUP_MIN_ORDER 3

 #ifdef CONFIG_CGROUP_HUGETLB

-static inline struct hugetlb_cgroup *hugetlb_cgroup_from_page(struct page *page)
+static inline struct hugetlb_cgroup *hugetlb_cgroup_from_page(struct page *page,
+ bool reserved)
 {
VM_BUG_ON_PAGE(!PageHuge(page), page);

if (compound_order(page) < HUGETLB_CGROUP_MIN_ORDER)
return NULL;
-   return (struct hugetlb_cgroup *)page[2].private;
+   if (reserved)
+   return (struct hugetlb_cgroup *)page[3].private;
+   else
+   return (struct hugetlb_cgroup *)page[2].private;
 }

-static inline
-int set_hugetlb_cgroup(struct page *page, struct hugetlb_cgroup *h_cg)
+static inline int set_hugetlb_cgroup(struct page *page,
+struct hugetlb_cgroup *h_cg,
+bool reservation)
 {
VM_BUG_ON_PAGE(!PageHuge(page), page);

if (compound_order(page) < HUGETLB_CGROUP_MIN_ORDER)
return -1;
-   page[2].private = (unsigned long)h_cg;
+   if (reservation)
+   page[3].private = (unsigned long)h_cg;
+   else
+   page[2].private = (unsigned long)h_cg;
return 0;
 }

@@ -52,26 +60,33 @@ static inline bool hugetlb_cgroup_disabled(void)
 }

 extern int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
-   struct hugetlb_cgroup **ptr);
+   struct hugetlb_cgroup **ptr,
+   bool reserved);
 extern void hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages,
 struct hugetlb_cgroup *h_cg,
-struct page *page);
+struct page *page, bool reserved);
 extern void hugetlb_cgroup_uncharge_page(int idx, unsigned long nr_pages,
-struct page *page);
+struct page *page, bool reserved);
+
 extern void hugetlb_cgroup_uncharge_cgroup(int idx, unsigned long nr_pages,
-  struct hugetlb_cgroup *h_cg);
+  struct hugetlb_cgroup *h_cg,
+  bool reserved);
+extern void hugetlb_cgroup_uncharge_counter(struct page_counter *p,
+   unsigned long nr_pages);
+
 extern void hugetlb_cgroup_file_init(void) __init;
 extern void hugetlb_cgroup_migrate(struct page *oldhpage,
   struct page *newhpage);

 #else
-static inline struct hugetlb_cgroup *hugetlb_cgroup_from_page(struct page *page)
+static inline struct hugetlb_cgroup *hugetlb_cgroup_from_page(struct page *page,
+ bool reserved)
 {
return NULL;
 }

-static inline
-int set_hugetlb_cgroup(struct page *page, struct hugetlb_cgroup *h_cg)
+static inline int set_hugetlb_cgroup(struct page *page,
+struct hugetlb_cgroup *h_cg, bool reserved)
 {
return 0;
 }
@@ -81,28 +96,30 @@ static inline bool hugetlb_cgroup_disabled(void)
return true;
 }

-static inline int
-hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
-struct hugetlb_cgroup **ptr)
+static inline int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
+  struct hugetlb_cgroup **ptr,
+  bool reserved)
 {
return 0;
 }

-static inline void
-hugetlb_cgroup_commit_charge(int idx, unsigned long nr_pages,
-struct hugetlb_cgroup *h_cg,
-struct page *page)
+static inline void hugetlb_cgroup_commit_charge(int idx, unsigned long

[PATCH v6 1/9] hugetlb_cgroup: Add hugetlb_cgroup reservation counter

2019-10-12 Thread Mina Almasry

These counters will track hugetlb reservations rather than hugetlb
memory faulted in. This patch only adds the counter; following patches
add the charging and uncharging of the counter.

Problem:
Currently, tasks attempting to allocate more hugetlb memory than is available
get a failure at mmap/shmget time. This is thanks to Hugetlbfs Reservations [1].
However, if a task attempts to allocate more hugetlb memory than its
hugetlb_cgroup limit allows, the kernel will allow the mmap/shmget call,
but will SIGBUS the task when it attempts to fault the memory in.

We have developers interested in using hugetlb_cgroups, and they have expressed
dissatisfaction regarding this behavior. We'd like to improve this
behavior such that tasks violating the hugetlb_cgroup limits get an error at
mmap/shmget time, rather than getting SIGBUS'd when they try to fault
the excess memory in.

The underlying problem is that today's hugetlb_cgroup accounting happens
at hugetlb memory *fault* time, rather than at *reservation* time.
Thus, enforcing the hugetlb_cgroup limit only happens at fault time, and
the offending task gets SIGBUS'd.

Proposed Solution:
A new page counter named hugetlb.xMB.reservation_[limit|usage]_in_bytes. This
counter has slightly different semantics than
hugetlb.xMB.[limit|usage]_in_bytes:

- While usage_in_bytes tracks all *faulted* hugetlb memory,
reservation_usage_in_bytes tracks all *reserved* hugetlb memory and
hugetlb memory faulted in without a prior reservation.

- If a task attempts to reserve more memory than limit_in_bytes allows,
the kernel will allow it to do so. But if a task attempts to reserve
more memory than reservation_limit_in_bytes, the kernel will fail this
reservation.
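
The difference between the two limits can be sketched in user-space C. This is
an illustrative model only: struct toy_cgroup, toy_reserve() and toy_fault() are
hypothetical names rather than kernel APIs, and the plain integer comparisons
stand in for what struct page_counter does in the real series.

```c
#include <stdbool.h>

/* Toy model (not kernel code) of the two hugetlb_cgroup counters. */
struct toy_cgroup {
	unsigned long usage, limit;		/* faulted memory */
	unsigned long resv_usage, resv_limit;	/* reservations + no-reserve faults */
};

/* Reservation time (mmap/shmget): only the reservation limit is enforced. */
static bool toy_reserve(struct toy_cgroup *cg, unsigned long bytes)
{
	if (cg->resv_usage + bytes > cg->resv_limit)
		return false;	/* the reservation fails up front */
	cg->resv_usage += bytes;
	return true;
}

/*
 * Fault time: a false return models SIGBUS. Faults without a prior
 * reservation are also charged to the reservation counter.
 */
static bool toy_fault(struct toy_cgroup *cg, unsigned long bytes,
		      bool has_reservation)
{
	bool charge_resv = !has_reservation;

	if (charge_resv && cg->resv_usage + bytes > cg->resv_limit)
		return false;
	if (cg->usage + bytes > cg->limit)
		return false;
	if (charge_resv)
		cg->resv_usage += bytes;
	cg->usage += bytes;
	return true;
}
```

In this model a task that stays within its reservation limit is never refused
at fault time by the reservation counter, which is the behavior the series
aims for.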

This proposal is implemented in this patch series, with tests to verify
functionality and show the usage. We also added cgroup-v2 support to
hugetlb_cgroup so that the new use cases can be extended to v2.

Alternatives considered:
1. A new cgroup, instead of only a new page_counter attached to
   the existing hugetlb_cgroup. Adding a new cgroup seemed like a lot of code
   duplication with hugetlb_cgroup. Keeping hugetlb related page counters under
   hugetlb_cgroup seemed cleaner as well.

2. Instead of adding a new counter, we considered adding a sysctl that modifies
   the behavior of hugetlb.xMB.[limit|usage]_in_bytes, to do accounting at
   reservation time rather than fault time. Adding a new page_counter seems
   better as userspace could, if it wants, choose to enforce different cgroups
   differently: one via limit_in_bytes, and another via
   reservation_limit_in_bytes. This could be very useful if you're
   transitioning how hugetlb memory is partitioned on your system one
   cgroup at a time, for example. Also, someone may find usage for both
   limit_in_bytes and reservation_limit_in_bytes concurrently, and this
   approach gives them the option to do so.

Testing:
- Added tests passing.
- libhugetlbfs tests mostly passing, but some tests have trouble with and
  without this patch series. This seems to be an environment issue rather
  than a code issue:
  - Overall results:
    ********** TEST SUMMARY
    *                      2M
    *                      32-bit 64-bit
    *     Total testcases:    84      0
    *             Skipped:     0      0
    *                PASS:    66      0
    *                FAIL:    14      0
    *    Killed by signal:     0      0
    *   Bad configuration:     4      0
    *       Expected FAIL:     0      0
    *     Unexpected PASS:     0      0
    *    Test not present:     0      0
    * Strange test result:     0      0
    **********
  - Failing tests:
    - elflink_rw_and_share_test("linkhuge_rw") segfaults with and without this
      patch series.
    - LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc (2M: 32):
      FAIL Address is not hugepage
    - LD_PRELOAD=libhugetlbfs.so HUGETLB_RESTRICT_EXE=unknown:malloc
      HUGETLB_MORECORE=yes malloc (2M: 32):
      FAIL Address is not hugepage
    - LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes malloc_manysmall (2M: 32):
      FAIL Address is not hugepage
    - GLIBC_TUNABLES=glibc.malloc.tcache_count=0 LD_PRELOAD=libhugetlbfs.so
      HUGETLB_MORECORE=yes heapshrink (2M: 32):
      FAIL Heap not on hugepages
    - GLIBC_TUNABLES=glibc.malloc.tcache_count=0 LD_PRELOAD=libhugetlbfs.so
      libheapshrink.so HUGETLB_MORECORE=yes heapshrink (2M: 32):
      FAIL Heap not on hugepages
    - HUGETLB_ELFMAP=RW linkhuge_rw (2M: 32): FAIL small_data is not hugepage
    - HUGETLB_ELFMAP=RW HUGETLB_MINIMAL_COPY=no linkhuge_rw (2M: 32):
      FAIL small_data is not hugepage
    - alloc-instantiate-race shared (2M: 32):
      Bad configuration: sched_setaffinity(cpu1): Invalid argument -
      FAIL Child 1 killed by signal Killed
    - shmoverride_linked (2M: 32):
      FAIL shmget failed size 2097152 from line 176: Invalid argument
    - HUGETLB_SHM=yes shmoverride_linked (2M: 32):
      FAIL shmget failed

[PATCH v6 3/9] hugetlb_cgroup: add cgroup-v2 support

2019-10-12 Thread Mina Almasry

---
 mm/hugetlb_cgroup.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 854117513979b..ac1500205faf7 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -503,8 +503,13 @@ static void __init __hugetlb_cgroup_file_init(int idx)
cft = &h->cgroup_files[HUGETLB_RES_NULL];
memset(cft, 0, sizeof(*cft));

-   WARN_ON(cgroup_add_legacy_cftypes(&hugetlb_cgrp_subsys,
-                                     h->cgroup_files));
+   if (cgroup_subsys_on_dfl(hugetlb_cgrp_subsys)) {
+       WARN_ON(cgroup_add_dfl_cftypes(&hugetlb_cgrp_subsys,
+                                      h->cgroup_files));
+   } else {
+       WARN_ON(cgroup_add_legacy_cftypes(&hugetlb_cgrp_subsys,
+                                         h->cgroup_files));
+   }
 }

 void __init hugetlb_cgroup_file_init(void)
@@ -548,8 +553,14 @@ void hugetlb_cgroup_migrate(struct page *oldhpage, struct page *newhpage)
return;
 }

+static struct cftype hugetlb_files[] = {
+   {} /* terminate */
+};
+
 struct cgroup_subsys hugetlb_cgrp_subsys = {
.css_alloc  = hugetlb_cgroup_css_alloc,
.css_offline= hugetlb_cgroup_css_offline,
.css_free   = hugetlb_cgroup_css_free,
+   .dfl_cftypes = hugetlb_files,
+   .legacy_cftypes = hugetlb_files,
 };
--
2.23.0.700.g56cf767bdb-goog

[PATCH v6 5/9] hugetlb: disable region_add file_region coalescing

2019-10-12 Thread Mina Almasry

A follow-up patch in this series adds hugetlb cgroup uncharge info to the
file_region entries in resv->regions. The cgroup uncharge info may
differ for different regions, so they can no longer be coalesced at
region_add time. So, disable region coalescing in region_add in this
patch.

Behavior change:

Say a resv_map exists like this [0->1], [2->3], and [5->6].

Then a region_chg/add call comes in: region_chg/add(f=0, t=5).

Old code would generate resv->regions: [0->5], [5->6].
New code would generate resv->regions: [0->1], [1->2], [2->3], [3->5],
[5->6].
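The behavior change above can be sketched in plain C. This is only a userspace illustration, not the kernel code: `struct range` and `fill_gaps()` are hypothetical names, the existing regions are assumed to be sorted and non-overlapping, and the function returns the number of new entries rather than the number of pages that add_reservation_in_range() counts.

```c
#include <stddef.h>

/* Half-open range [from, to), mirroring the from/to pair in file_region. */
struct range {
	long from;
	long to;
};

/*
 * Hypothetical helper: given `n` existing ranges (sorted, non-overlapping),
 * write the parts of [f, t) that are not yet covered into `out` (capacity
 * `cap`). Existing ranges are never merged or modified, which is the
 * "no coalescing" behavior. Returns the number of new entries needed.
 */
size_t fill_gaps(const struct range *existing, size_t n, long f, long t,
		 struct range *out, size_t cap)
{
	long last = f;	/* everything in [f, last) is already accounted for */
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		const struct range *rg = &existing[i];

		if (rg->from < f) {		/* starts before our range */
			if (rg->to > last)
				last = rg->to;
			continue;
		}
		if (rg->from > t)		/* starts beyond our range */
			break;
		if (rg->from > last) {		/* uncovered gap before rg */
			if (count < cap)
				out[count] = (struct range){ last, rg->from };
			count++;
		}
		last = rg->to;
	}
	if (last < t) {				/* tail after the last region */
		if (count < cap)
			out[count] = (struct range){ last, t };
		count++;
	}
	return count;
}
```

With the resv_map from the example, [0->1], [2->3], [5->6], and a call with f=0, t=5, this sketch produces the two new entries [1->2] and [3->5], which together with the untouched existing entries gives the five-entry list shown for the new code.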

Special care needs to be taken to handle the resv->adds_in_progress
variable correctly. In the past, only 1 region would be added for every
region_chg and region_add call. But now, each call may add multiple
regions, so we can no longer increment adds_in_progress by 1 in region_chg,
or decrement adds_in_progress by 1 after region_add or region_abort. Instead,
region_chg calls add_reservation_in_range() to count the number of regions
needed and allocates those, and that info is passed to region_add and
region_abort to decrement adds_in_progress correctly.

Signed-off-by: Mina Almasry 

---

Changes in v6:
- Fix bug in number of region_caches allocated by region_chg

---
 mm/hugetlb.c | 256 +--
 1 file changed, 147 insertions(+), 109 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4a60d7d44b4c3..f9c1947925bb9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -244,6 +244,12 @@ struct file_region {
long to;
 };

+/* Helper that removes a struct file_region from the resv_map cache and returns
+ * it for use.
+ */
+static struct file_region *
+get_file_region_entry_from_cache(struct resv_map *resv, long from, long to);
+
 /* Must be called with resv->lock held. Calling this with count_only == true
  * will count the number of pages to be added but will not modify the linked
  * list.
@@ -251,51 +257,61 @@ struct file_region {
 static long add_reservation_in_range(struct resv_map *resv, long f, long t,
 bool count_only)
 {
-   long chg = 0;
+   long add = 0;
struct list_head *head = &resv->regions;
+   long last_accounted_offset = f;
struct file_region *rg = NULL, *trg = NULL, *nrg = NULL;

-   /* Locate the region we are before or in. */
-   list_for_each_entry (rg, head, link)
-   if (f <= rg->to)
-   break;
-
-   /* Round our left edge to the current segment if it encloses us. */
-   if (f > rg->from)
-   f = rg->from;
-
-   chg = t - f;
+   /* In this loop, we essentially handle an entry for the range
+* last_accounted_offset -> rg->from, at every iteration, with some
+* bounds checking.
+*/
+   list_for_each_entry_safe(rg, trg, head, link) {
+   /* Skip irrelevant regions that start before our range. */
+   if (rg->from < f) {
+   /* If this region ends after the last accounted offset,
+* then we need to update last_accounted_offset.
+*/
+   if (rg->to > last_accounted_offset)
+   last_accounted_offset = rg->to;
+   continue;
+   }

-   /* Check for and consume any regions we now overlap with. */
-   nrg = rg;
-   list_for_each_entry_safe (rg, trg, rg->link.prev, link) {
-   if (&rg->link == head)
-   break;
+   /* When we find a region that starts beyond our range, we've
+* finished.
+*/
if (rg->from > t)
break;

-   /* We overlap with this area, if it extends further than
-* us then we must extend ourselves.  Account for its
-* existing reservation.
+   /* Add an entry for last_accounted_offset -> rg->from, and
+* update last_accounted_offset.
 */
-   if (rg->to > t) {
-   chg += rg->to - t;
-   t = rg->to;
+   if (rg->from > last_accounted_offset) {
+   add += rg->from - last_accounted_offset;
+   if (!count_only) {
+   nrg = get_file_region_entry_from_cache(
+   resv, last_accounted_offset, rg->from);
+   list_add(&nrg->link, rg->link.prev);
+   }
}
-   chg -= rg->to - rg->from;

-   if (!count_only && rg != nrg) {
-   list_del(&rg->link);
-   kfree(rg);
-   }
+   last_accounted_offset = rg->to;
}

-   if (!count_only) {
-   nrg->from = f;
-   nrg->to = t;
+   /* Handle the case where our range extends beyond
+

Re: [PATCH 4/7] Teach SELinux about a new userfaultfd class

2019-10-12 Thread Daniel Colascione

On Sat, Oct 12, 2019 at 4:09 PM Andy Lutomirski  wrote:
>
> On Sat, Oct 12, 2019 at 12:16 PM Daniel Colascione  wrote:
> >
> > Use the secure anonymous inode LSM hook we just added to let SELinux
> > policy place restrictions on userfaultfd use. The create operation
> > applies to processes creating new instances of these file objects;
> > transfer between processes is covered by restrictions on read, write,
> > and ioctl access already checked inside selinux_file_receive.
>
> This is great, and I suspect we'll want it for things like SGX, too.
> But the current design seems like it will make it essentially
> impossible for SELinux to reference an anon_inode class whose
> file_operations are in a module, and moving file_operations out of a
> module would be nasty.
>
> Could this instead be keyed off a new struct anon_inode_class, an
> enum, or even just a string?

The new LSM hook already receives the string that callers pass to the
anon_inode APIs; modules can look at that instead of the fops if they
want. The reason to pass both the name and the fops through the hook
is to allow LSMs to match using fops comparison (which seems less
prone to breakage) when possible and rely on string matching when it
isn't.
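A minimal sketch of the matching strategy described above, assuming a policy entry that carries an optional fops pointer and a name. `struct fops_stub`, `struct anon_class_rule`, and `rule_matches()` are hypothetical stand-ins for illustration only; they are not the hook's real signature or any SELinux data structure.

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in for the kernel's struct file_operations (illustration only). */
struct fops_stub {
	int unused;
};

/* Hypothetical policy entry: compare fops when available, else the name. */
struct anon_class_rule {
	const struct fops_stub *fops;	/* NULL when the fops cannot be referenced */
	const char *name;		/* string the caller passed to the anon_inode API */
};

bool rule_matches(const struct anon_class_rule *rule,
		  const char *name, const struct fops_stub *fops)
{
	/* Pointer comparison does not break if the name string changes. */
	if (rule->fops)
		return rule->fops == fops;

	/* Fallback: string matching, e.g. for fops that live in a module. */
	return rule->name && name && strcmp(rule->name, name) == 0;
}
```

The fallback path is what would cover the module case raised in the quoted question, at the cost of depending on the name staying stable.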

Re: [PATCH bpf v2] libbpf: fix passing uninitialized bytes to setsockopt

2019-10-12 Thread Alexei Starovoitov

On Wed, Oct 09, 2019 at 06:49:29PM +0200, Ilya Maximets wrote:
> 'struct xdp_umem_reg' has 4 bytes of padding at the end that makes
> valgrind complain about passing uninitialized stack memory to the
> syscall:
> 
>   Syscall param socketcall.setsockopt() points to uninitialised byte(s)
> at 0x4E7AB7E: setsockopt (in /usr/lib64/libc-2.29.so)
> by 0x4BDE035: xsk_umem__create@@LIBBPF_0.0.4 (xsk.c:172)
>   Uninitialised value was created by a stack allocation
> at 0x4BDDEBA: xsk_umem__create@@LIBBPF_0.0.4 (xsk.c:140)
> 
> Padding bytes appeared after the introduction of the new 'flags' field.
> memset() is required to clear them.
> 
> Fixes: 10d30e301732 ("libbpf: add flags to umem config")
> Signed-off-by: Ilya Maximets 
> ---
> 
> Version 2:
>   * Struct initializer replaced with explicit memset(). [Andrii]
> 
>  tools/lib/bpf/xsk.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
> index a902838f9fcc..9d5348086203 100644
> --- a/tools/lib/bpf/xsk.c
> +++ b/tools/lib/bpf/xsk.c
> @@ -163,6 +163,7 @@ int xsk_umem__create_v0_0_4(struct xsk_umem **umem_ptr, void *umem_area,
>   umem->umem_area = umem_area;
>   xsk_set_umem_config(&umem->config, usr_config);
>  
> + memset(&mr, 0, sizeof(mr));
>   mr.addr = (uintptr_t)umem_area;
>   mr.len = size;
>   mr.chunk_size = umem->config.frame_size;

This was already applied. Why did you resend?
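For readers following the quoted fix, here is a small sketch of why the memset() matters. `struct umem_reg_like` is an illustrative layout modeled on the fields named in the patch, not the real UAPI definition, and `tail_padding()` and `fill_reg()` are hypothetical helpers. The 4 bytes of tail padding are what one would expect on common 64-bit ABIs; other ABIs may lay the struct out differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative layout only; modeled on the fields named in the patch. */
struct umem_reg_like {
	uint64_t addr;
	uint64_t len;
	uint32_t chunk_size;
	uint32_t headroom;
	uint32_t flags;
	/* Common 64-bit ABIs round the size up to a multiple of 8 here. */
};

/* Number of bytes after the last member that belong to no member. */
size_t tail_padding(void)
{
	return sizeof(struct umem_reg_like) -
	       (offsetof(struct umem_reg_like, flags) + sizeof(uint32_t));
}

/*
 * Zero the whole object first, padding included, then set the members.
 * Assigning members one by one on an uninitialized stack object would
 * leave the padding bytes with whatever the stack held before.
 */
void fill_reg(struct umem_reg_like *mr, void *area, uint64_t size,
	      uint32_t frame_size)
{
	memset(mr, 0, sizeof(*mr));
	mr->addr = (uintptr_t)area;
	mr->len = size;
	mr->chunk_size = frame_size;
}
```

Since the whole object is what gets handed to the syscall, a tool like valgrind sees every byte of it, including the ones no member assignment touches.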

Re: [PATCH 3/3] rtc: ds1685: add indirect access method and remove plat_read/plat_write

2019-10-12 Thread Joshua Kinard

On 10/11/2019 11:05, Thomas Bogendoerfer wrote:
> Use of provided plat_read/plat_write introduces the problem of possible
> different lifetime of rtc driver and plat_XXX function provider. As
> this was only intended for SGI Octane (IP30) this patchset implements
> a register indirect access method for IP30 and introduces an
> access_type field in platform data to select how registers are
> accessed. And since there are no resource allocating stunts needed
> anymore it also gets rid of alloc_io_resources from platform data.
> 

Actually, I did it this way because IP32 was already in-tree, and IP30 was
not.  So the default ds1685_{read,write} functions were geared for the
in-tree machine, and IP30 brought along its own versions.  If IP30 support
gets merged into the kernel, this isn't needed anymore, but I don't think
this explanation accurately captures that.

The chief difference between IP32 and IP30's manner of accessing the RTC
is that IP32 has a 256-byte gap between each RTC register for unknown
reasons (this is documented in the IP32 hardware data sheets I have), and
access has to be MMIO'ed, since the RTC is hanging off of the MACE PCI
structs, like every other device in IP32's code.  IP30 doesn't have this
register gap to worry about, and it accesses the RTC registers via PIO.
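The two access styles can be illustrated with a userspace simulation. This is a sketch under stated assumptions, not driver code: `direct_read()`, `struct indirect_dev`, `indirect_read()`, and `indirect_write()` are hypothetical names, plain memory stands in for the hardware, and the indirect variant follows the address-port/data-port shape of the functions in the quoted patch.

```c
#include <stdint.h>

/*
 * Direct access: register N lives at base + N * step. A step of 256 would
 * model the per-register gap described above.
 */
uint8_t direct_read(const volatile uint8_t *base, unsigned int step, int reg)
{
	return base[(unsigned int)reg * step];
}

/* Simulated indirect device: one address port and a bank behind a data port. */
struct indirect_dev {
	uint8_t addr_port;
	uint8_t bank[256];
};

uint8_t indirect_read(struct indirect_dev *dev, int reg)
{
	dev->addr_port = (uint8_t)reg;		/* select the register */
	return dev->bank[dev->addr_port];	/* then read via the data port */
}

void indirect_write(struct indirect_dev *dev, int reg, uint8_t value)
{
	dev->addr_port = (uint8_t)reg;		/* select the register */
	dev->bank[dev->addr_port] = value;	/* then write via the data port */
}
```

The simulation leaves out any register-number masking, since whether a mask is still needed is exactly the open question in this thread.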


> Signed-off-by: Thomas Bogendoerfer 
> ---
>  arch/mips/sgi-ip32/ip32-platform.c |  2 +-
>  drivers/rtc/rtc-ds1685.c   | 67 
> --
>  include/linux/rtc/ds1685.h |  8 +++--
>  3 files changed, 48 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/mips/sgi-ip32/ip32-platform.c b/arch/mips/sgi-ip32/ip32-platform.c
> index 5a2a82148d8d..c3909bd8dd1a 100644
> --- a/arch/mips/sgi-ip32/ip32-platform.c
> +++ b/arch/mips/sgi-ip32/ip32-platform.c
> @@ -115,7 +115,7 @@ ip32_rtc_platform_data[] = {
>   .bcd_mode = true,
>   .no_irq = false,
>   .uie_unsupported = false,
> - .alloc_io_resources = true,
> + .access_type = ds1685_reg_direct,
>   .plat_prepare_poweroff = ip32_prepare_poweroff,
>   },
>  };
> diff --git a/drivers/rtc/rtc-ds1685.c b/drivers/rtc/rtc-ds1685.c
> index 349a8d1caca1..9c5d064ebb6c 100644
> --- a/drivers/rtc/rtc-ds1685.c
> +++ b/drivers/rtc/rtc-ds1685.c
> @@ -59,6 +59,32 @@ ds1685_write(struct ds1685_priv *rtc, int reg, u8 value)
>  }
>  /* --- */
>  
> +/* Indirect read/write functions */
> +
> +/**
> + * ds1685_indir_read - read a value from an rtc register.
> + * @rtc: pointer to the ds1685 rtc structure.
> + * @reg: the register address to read.
> + */
> +static u8
> +ds1685_indir_read(struct ds1685_priv *rtc, int reg)
> +{
> + writeb(reg, rtc->regs);
> + return readb(rtc->data);
> +}
> +
> +/**
> + * ds1685_indir_write - write a value to an rtc register.
> + * @rtc: pointer to the ds1685 rtc structure.
> + * @reg: the register address to write.
> + * @value: value to write to the register.
> + */
> +static void
> +ds1685_indir_write(struct ds1685_priv *rtc, int reg, u8 value)
> +{
> + writeb(reg, rtc->regs);
> + writeb(value, rtc->data);
> +}

IP30 applied a mask of 0x7f on the 'reg' parameter on both of its
read/write functions, which was from Stan's original code.  Is this mask
not needed any more with the other changes you made to the IP30 code?  I
remember trying to do without this mask once long ago, and something broke,
so I have left it in ever since.

>  
>  /* --- */
>  /* Inlined functions */
> @@ -1062,16 +1088,25 @@ ds1685_rtc_probe(struct platform_device *pdev)
>   if (!rtc)
>   return -ENOMEM;
>  
> - /*
> -  * Allocate/setup any IORESOURCE_MEM resources, if required.  Not all
> -  * platforms put the RTC in an easy-access place.  Like the SGI Octane,
> -  * which attaches the RTC to a "ByteBus", hooked to a SuperIO chip
> -  * that sits behind the IOC3 PCI metadevice.
> -  */
> - if (pdata->alloc_io_resources) {
> + /* Setup resources and access functions */
> + switch (pdata->access_type) {
> + case ds1685_reg_direct:
> + rtc->regs = devm_platform_ioremap_resource(pdev, 0);
> + if (IS_ERR(rtc->regs))
> + return PTR_ERR(rtc->regs);
> + rtc->read = ds1685_read;
> + rtc->write = ds1685_write;
> + break;
> + case ds1685_reg_indirect:
>   rtc->regs = devm_platform_ioremap_resource(pdev, 0);
>   if (IS_ERR(rtc->regs))
>   return PTR_ERR(rtc->regs);
> + rtc->data = devm_platform_ioremap_resource(pdev, 1);
> + if (IS_ERR(rtc->data))
> + return PTR_ERR(rtc->data);
> + rtc->read = ds1685_indir_read;
> + rtc->write = ds1685_indir_write;
> + break;
>   }

I

Re: [PATCH v5 bpf-next 00/15] samples: bpf: improve/fix cross-compilation

2019-10-12 Thread Alexei Starovoitov

On Fri, Oct 11, 2019 at 5:07 AM Ilias Apalodimas
 wrote:
>
> On Fri, Oct 11, 2019 at 03:27:53AM +0300, Ivan Khoronzhuk wrote:
> > This series contains mainly fixes/improvements for cross-compilation
> > but not only, tested for arm, arm64, and intended for any arch.
> > Also verified on native build (not cross compilation) for x86_64
> > and arm, arm64.
...
> For native compilation on x86_64 and aarch64
>
> Tested-by: Ilias Apalodimas 

Applied. Thanks

Re: [PATCH 6/7] Allow users to require UFFD_SECURE

2019-10-12 Thread Andy Lutomirski

On Sat, Oct 12, 2019 at 12:16 PM Daniel Colascione  wrote:
>
> This change adds 2 as an allowable value for
> unprivileged_userfaultfd. (Previously, this sysctl could be either 0
> or 1.) When unprivileged_userfaultfd is 2, users with CAP_SYS_PTRACE
> may create userfaultfd with or without UFFD_SECURE, but users without
> CAP_SYS_PTRACE must pass UFFD_SECURE to userfaultfd in order for the
> system call to succeed, effectively forcing them to opt into
> additional security checks.

This patch can go away entirely if you make UFFD_SECURE automatic.
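The sysctl semantics described in the quoted patch can be written as a small decision function. This is a sketch, assuming the existing behavior is that 0 restricts userfaultfd to callers with CAP_SYS_PTRACE and 1 allows everyone; `uffd_create_allowed()` and its parameters are hypothetical names, not code from the series.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: may this caller create a userfaultfd?
 *   sysctl_value        value of unprivileged_userfaultfd (0, 1 or 2)
 *   has_cap_sys_ptrace  caller holds CAP_SYS_PTRACE
 *   wants_secure        caller passed UFFD_SECURE
 */
bool uffd_create_allowed(int sysctl_value, bool has_cap_sys_ptrace,
			 bool wants_secure)
{
	/* Privileged callers may create with or without UFFD_SECURE. */
	if (has_cap_sys_ptrace)
		return true;

	switch (sysctl_value) {
	case 1:
		return true;		/* unprivileged use allowed */
	case 2:
		return wants_secure;	/* allowed only with UFFD_SECURE */
	default:
		return false;		/* 0: privileged callers only */
	}
}
```

If UFFD_SECURE became automatic, as suggested above, the `wants_secure` input and the case for 2 would have nothing left to decide.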

Re: [PATCH 3/7] Add a UFFD_SECURE flag to the userfaultfd API.

2019-10-12 Thread Andy Lutomirski

On Sat, Oct 12, 2019 at 12:16 PM Daniel Colascione  wrote:
>
> The new secure flag makes userfaultfd use a new "secure" anonymous
> file object instead of the default one, letting security modules
> supervise userfaultfd use.
>
> Requiring that users pass a new flag lets us avoid changing the
> semantics for existing callers.

Is there any good reason not to make this be the default?

The only downside I can see is that it would increase the memory usage
of userfaultfd(), but that doesn't seem like such a big deal.  A
lighter-weight alternative would be to have a single inode shared by
all userfaultfd instances, which would require a somewhat different
internal anon_inode API.

In any event, I don't think that "make me visible to SELinux" should
be a choice that user code makes.

--Andy

Re: [PATCH 4/7] Teach SELinux about a new userfaultfd class

2019-10-12 Thread Andy Lutomirski

On Sat, Oct 12, 2019 at 12:16 PM Daniel Colascione  wrote:
>
> Use the secure anonymous inode LSM hook we just added to let SELinux
> policy place restrictions on userfaultfd use. The create operation
> applies to processes creating new instances of these file objects;
> transfer between processes is covered by restrictions on read, write,
> and ioctl access already checked inside selinux_file_receive.

This is great, and I suspect we'll want it for things like SGX, too.
But the current design seems like it will make it essentially
impossible for SELinux to reference an anon_inode class whose
file_operations are in a module, and moving file_operations out of a
module would be nasty.

Could this instead be keyed off a new struct anon_inode_class, an
enum, or even just a string?

--Andy

Re: [PATCH 1/7 v2] tracefs: Revert ccbd54ff54e8 ("tracefs: Restrict tracefs when the kernel is locked down")

2019-10-12 Thread Linus Torvalds

On Fri, Oct 11, 2019 at 5:59 PM Steven Rostedt  wrote:
>
>
> I bisected this down to the addition of the proxy_ops into tracefs for
> lockdown. It appears that the allocation of the proxy_ops and then freeing
> it in the destroy_inode callback, is causing havoc with the memory system.
> Reading the documentation about destroy_inode and talking with Linus about
> this, this is buggy and wrong.

Can you still add the explanation about the inode memory leak to this message?

Right now it just says "it's buggy and wrong". True. But doesn't
explain _why_ it is buggy and wrong.

  Linus

Re: [GIT PULL] Staging/IIO driver fixes for 5.4-rc3

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 18:16:38 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git 
> tags/staging-5.4-rc3

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/9cbc63485fd5e25cef5d64c28ca3318364073773

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] Char/Misc driver fixes for 5.4-rc3

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 18:16:59 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git 
> tags/char-misc-5.4-rc3

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/da94001239cceb93c132a31928d6ddc4214862d5

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] USB fixes for 5.4-rc3

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 18:15:53 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git tags/usb-5.4-rc3

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/6c90bbd0a4e133665128a941ffcb4f7ac5dcb3cf

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] TTY/Serial fixes for 5.4-rc3

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 18:16:14 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git tags/tty-5.4-rc3

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/82c87e7d4068d0fc368c3e7356a94e7b87c29544

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [PATCH 2/3] rtc: ds1685: use devm_platform_ioremap_resource helper

2019-10-12 Thread Joshua Kinard

On 10/11/2019 11:05, Thomas Bogendoerfer wrote:
> Simplify ioremapping of registers by using devm_platform_ioremap_resource.
> 
> Signed-off-by: Thomas Bogendoerfer 
> ---
>  drivers/rtc/rtc-ds1685.c   | 23 +++
>  include/linux/rtc/ds1685.h |  1 -
>  2 files changed, 3 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/rtc/rtc-ds1685.c b/drivers/rtc/rtc-ds1685.c
> index 51f568473de8..349a8d1caca1 100644
> --- a/drivers/rtc/rtc-ds1685.c
> +++ b/drivers/rtc/rtc-ds1685.c
> @@ -1040,7 +1040,6 @@ static int
>  ds1685_rtc_probe(struct platform_device *pdev)
>  {
>   struct rtc_device *rtc_dev;
> - struct resource *res;
>   struct ds1685_priv *rtc;
>   struct ds1685_rtc_platform_data *pdata;
>   u8 ctrla, ctrlb, hours;
> @@ -1070,25 +1069,9 @@ ds1685_rtc_probe(struct platform_device *pdev)
>* that sits behind the IOC3 PCI metadevice.
>*/
>   if (pdata->alloc_io_resources) {
> - /* Get the platform resources. */
> - res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> - if (!res)
> - return -ENXIO;
> - rtc->size = resource_size(res);
> -
> - /* Request a memory region. */
> - /* XXX: mmio-only for now. */
> - if (!devm_request_mem_region(&pdev->dev, res->start, rtc->size,
> -  pdev->name))
> - return -EBUSY;
> -
> - /*
> -  * Set the base address for the rtc, and ioremap its
> -  * registers.
> -  */
> - rtc->regs = devm_ioremap(&pdev->dev, res->start, rtc->size);
> - if (!rtc->regs)
> - return -ENOMEM;
> + rtc->regs = devm_platform_ioremap_resource(pdev, 0);
> + if (IS_ERR(rtc->regs))
> + return PTR_ERR(rtc->regs);
>   }
>  
>   /* Get the register step size. */
> diff --git a/include/linux/rtc/ds1685.h b/include/linux/rtc/ds1685.h
> index b9671d00d964..101c7adc05a2 100644
> --- a/include/linux/rtc/ds1685.h
> +++ b/include/linux/rtc/ds1685.h
> @@ -43,7 +43,6 @@ struct ds1685_priv {
>   struct rtc_device *dev;
>   void __iomem *regs;
>   u32 regstep;
> - size_t size;
>   int irq_num;
>   bool bcd_mode;
>   bool no_irq;
> 


Acked-by: Joshua Kinard

Re: [PATCH 1/3] rts: ds1685: remove not needed fields from private struct

2019-10-12 Thread Joshua Kinard

On 10/11/2019 11:05, Thomas Bogendoerfer wrote:
> A few of the fields in struct ds1685_priv aren't needed at all,
> so we can remove them.
> 
> Signed-off-by: Thomas Bogendoerfer 
> ---
>  drivers/rtc/rtc-ds1685.c   | 3 ---
>  include/linux/rtc/ds1685.h | 3 ---
>  2 files changed, 6 deletions(-)
> 
> diff --git a/drivers/rtc/rtc-ds1685.c b/drivers/rtc/rtc-ds1685.c
> index 184e4a3e2bef..51f568473de8 100644
> --- a/drivers/rtc/rtc-ds1685.c
> +++ b/drivers/rtc/rtc-ds1685.c
> @@ -1086,12 +1086,10 @@ ds1685_rtc_probe(struct platform_device *pdev)
>* Set the base address for the rtc, and ioremap its
>* registers.
>*/
> - rtc->baseaddr = res->start;
>   rtc->regs = devm_ioremap(&pdev->dev, res->start, rtc->size);
>   if (!rtc->regs)
>   return -ENOMEM;
>   }
> - rtc->alloc_io_resources = pdata->alloc_io_resources;
>  
>   /* Get the register step size. */
>   if (pdata->regstep > 0)
> @@ -1271,7 +1269,6 @@ ds1685_rtc_probe(struct platform_device *pdev)
>   /* See if the platform doesn't support UIE. */
>   if (pdata->uie_unsupported)
>   rtc_dev->uie_unsupported = 1;
> - rtc->uie_unsupported = pdata->uie_unsupported;
>  
>   rtc->dev = rtc_dev;
>  
> diff --git a/include/linux/rtc/ds1685.h b/include/linux/rtc/ds1685.h
> index 43aec568ba7c..b9671d00d964 100644
> --- a/include/linux/rtc/ds1685.h
> +++ b/include/linux/rtc/ds1685.h
> @@ -43,13 +43,10 @@ struct ds1685_priv {
>   struct rtc_device *dev;
>   void __iomem *regs;
>   u32 regstep;
> - resource_size_t baseaddr;
>   size_t size;
>   int irq_num;
>   bool bcd_mode;
>   bool no_irq;
> - bool uie_unsupported;
> - bool alloc_io_resources;
>   u8 (*read)(struct ds1685_priv *, int);
>   void (*write)(struct ds1685_priv *, int, u8);
>   void (*prepare_poweroff)(void);
> 

Acked-by: Joshua Kinard

Re: [PATCH net 0/2] vsock: don't allow half-closed socket in the host transports

2019-10-12 Thread Michael S. Tsirkin

On Fri, Oct 11, 2019 at 04:34:57PM +0200, Stefano Garzarella wrote:
> On Fri, Oct 11, 2019 at 10:19:13AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Oct 11, 2019 at 03:07:56PM +0200, Stefano Garzarella wrote:
> > > We are implementing a test suite for the VSOCK sockets and we discovered
> > > that vmci_transport never allowed half-closed socket on the host side.
> > > 
> > > As Jorgen explained [1] this is due to the implementation of VMCI.
> > > 
> > > Since we want to have the same behaviour across all transports, this
> > > series adds a section in the "Implementation notes" to explain this
> > > behaviour, and changes the vhost_transport to behave the same way.
> > > 
> > > [1] https://patchwork.ozlabs.org/cover/847998/#1831400
> > 
> > Half closed sockets are very useful, and lots of
> > applications use tricks to swap a vsock for a tcp socket,
> > which might as a result break.
> 
> Got it!
> 
> > 
> > If VMCI really cares it can implement an ioctl to
> > allow applications to detect that half closed sockets aren't supported.
> > 
> > It does not look like VMCI wants to bother (users do not read
> > kernel implementation notes) so it does not really care.
> > So why do we want to cripple other transports intentionally?
> 
> The main reason is that we are developing the test suite and we noticed
> the mismatch. Since we want to make sure that applications behave in
> the same way on different transports, we thought we would solve it that
> way.
> 
> But what you are saying (also in the reply of the patches) is actually
> quite right. Not being publicized, applications do not expect this behavior,
> so please discard this series.
> 
> My problem during the tests, was trying to figure out if half-closed
> sockets were supported or not, so as you say adding an IOCTL or maybe
> better a getsockopt() could solve the problem.
> 
> What do you think?
> 
> Thanks,
> Stefano

Sure, why not.

Re: [GIT PULL] perf fixes

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 15:31:34 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> perf-urgent-for-linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/465a7e291fd4f056d81baf5d5ed557bdb44c5457

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] EFI fixes

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 15:01:39 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git efi-urgent-for-linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/9b4e40c8fe1e120fef93985de7ff6a97fe9e7dd3

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] x86 fixes

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 15:19:16 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86-urgent-for-linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7a275fd7b9519b5cc63270a8964055aadb04de26

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] x86 license updates

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 13:52:57 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> core-urgent-for-linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/e9ec3588a9372dfb9b04afcddb199ad9e2be0044

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] scheduler fixes

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 16:58:36 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> sched-urgent-for-linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/328fefadd9cfa15cd6ab746553d9ef13303c11a6

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [PATCH] netdevsim: Fix error handling in nsim_fib_init and nsim_fib_exit

2019-10-12 Thread Jakub Kicinski

On Fri, 11 Oct 2019 17:46:53 +0800, YueHaibing wrote:
> In nsim_fib_init(), if register_fib_notifier fails, nsim_fib_net_ops
> should be unregistered before returning.
> 
> In nsim_fib_exit(), unregister_fib_notifier should be called before
> nsim_fib_net_ops is unregistered; otherwise a use-after-free may occur:
> 
> BUG: KASAN: use-after-free in nsim_fib_event_nb+0x342/0x570 [netdevsim]
> Read of size 8 at addr 8881daaf4388 by task kworker/0:3/3499
> 

> Reported-by: Hulk Robot 
> Fixes: 59c84b9fcf42 ("netdevsim: Restore per-network namespace accounting for fib entries")
> Signed-off-by: YueHaibing 

Acked-by: Jakub Kicinski
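The ordering rules in the quoted commit message follow a common init/exit pattern, sketched here with a toy model. All `demo_*` names and `struct demo_state` are hypothetical; the real code registers pernet operations and a FIB notifier, which this simulation only imitates with flags.

```c
/* Toy state standing in for the two registrations. */
struct demo_state {
	int net_ops_registered;
	int notifier_registered;
	int freed_under_notifier;	/* set if state was dropped while events could still arrive */
};

int demo_register_net_ops(struct demo_state *s)
{
	s->net_ops_registered = 1;
	return 0;
}

void demo_unregister_net_ops(struct demo_state *s)
{
	if (s->notifier_registered)
		s->freed_under_notifier = 1;	/* models the use-after-free window */
	s->net_ops_registered = 0;
}

int demo_register_notifier(struct demo_state *s, int should_fail)
{
	if (should_fail)
		return -1;
	s->notifier_registered = 1;
	return 0;
}

void demo_unregister_notifier(struct demo_state *s)
{
	s->notifier_registered = 0;
}

/* Init: if the second step fails, undo the first before returning. */
int demo_init(struct demo_state *s, int notifier_fails)
{
	int err = demo_register_net_ops(s);

	if (err)
		return err;

	err = demo_register_notifier(s, notifier_fails);
	if (err) {
		demo_unregister_net_ops(s);
		return err;
	}
	return 0;
}

/* Exit: tear down in reverse order, so no event sees freed state. */
void demo_exit(struct demo_state *s)
{
	demo_unregister_notifier(s);
	demo_unregister_net_ops(s);
}
```

Swapping the two calls in `demo_exit()` would set `freed_under_notifier`, which is the situation the KASAN report in the quoted message points at.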

Re: [Outreachy kernel] [PATCH v2 3/5] staging: octeon: remove typedef declaration for cvmx_fau_reg_32

2019-10-12 Thread Wambui Karuga

On Sat, Oct 12, 2019 at 08:37:18PM +0200, Julia Lawall wrote:
> 
> 
> On Sat, 12 Oct 2019, Wambui Karuga wrote:
> 
> > Remove typedef declaration for enum cvmx_fau_reg_32.
> > Also replace its previous uses with new declaration format.
> > Issue found by checkpatch.pl
> >
> > Signed-off-by: Wambui Karuga 
> > ---
> >  drivers/staging/octeon/octeon-stubs.h | 14 --
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/staging/octeon/octeon-stubs.h b/drivers/staging/octeon/octeon-stubs.h
> > index 0991be329139..40f0cfee0dff 100644
> > --- a/drivers/staging/octeon/octeon-stubs.h
> > +++ b/drivers/staging/octeon/octeon-stubs.h
> > @@ -201,9 +201,9 @@ union cvmx_helper_link_info {
> > } s;
> >  };
> >
> > -typedef enum {
> > +enum cvmx_fau_reg_32 {
> > CVMX_FAU_REG_32_START   = 0,
> > -} cvmx_fau_reg_32_t;
> > +};
> >
> >  typedef enum {
> > CVMX_FAU_OP_SIZE_8 = 0,
> > @@ -1178,16 +1178,18 @@ union cvmx_gmxx_rxx_rx_inbnd {
> > } s;
> >  };
> >
> > -static inline int32_t cvmx_fau_fetch_and_add32(cvmx_fau_reg_32_t reg,
> > +static inline int32_t cvmx_fau_fetch_and_add32(enum cvmx_fau_reg_32 reg,
> >int32_t value)
> 
> These int32_t's don't look very desirable either.  If there is only one
> possible definition, you can just replace it by what it is defined to be.
> 
> julia
> 
Ok, I'll look into refactoring this.

wambui karuga
> >  {
> > return value;
> >  }
> >
> > -static inline void cvmx_fau_atomic_add32(cvmx_fau_reg_32_t reg, int32_t value)
> > +static inline void cvmx_fau_atomic_add32(enum cvmx_fau_reg_32 reg,
> > +int32_t value)
> >  { }
> >
> > -static inline void cvmx_fau_atomic_write32(cvmx_fau_reg_32_t reg, int32_t value)
> > +static inline void cvmx_fau_atomic_write32(enum cvmx_fau_reg_32 reg,
> > +  int32_t value)
> >  { }
> >
> >  static inline uint64_t cvmx_scratch_read64(uint64_t address)
> > @@ -1364,7 +1366,7 @@ static inline int cvmx_spi_restart_interface(int interface,
> >  }
> >
> >  static inline void cvmx_fau_async_fetch_and_add32(uint64_t scraddr,
> > - cvmx_fau_reg_32_t reg,
> > + enum cvmx_fau_reg_32 reg,
> >   int32_t value)
> >  { }
> >
> > --
> > 2.23.0
> >
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "outreachy-kernel" group.
> > To unsubscribe from this group and stop receiving emails from it, send an 
> > email to outreachy-kernel+unsubscr...@googlegroups.com.
> > To view this discussion on the web visit 
> > https://groups.google.com/d/msgid/outreachy-kernel/b7216f423d8e06b2ed7ac2df643a9215cd95be32.1570821661.git.wambui.karugax%40gmail.com.
> >
> 

[PATCH 0/2] Formatting and style cleanup in rtl8712

2019-10-12 Thread Wambui Karuga

This patch series addresses the use of unnecessary return variables and
line-breaks in function headers, both in
drivers/staging/rtl8712/rtl871x_mp_ioctl.c. 

Wambui Karuga (2):
  staging: rtl8712: remove unnecessary return variables
  staging: rtl8712: clean up function headers

 drivers/staging/rtl8712/rtl871x_mp_ioctl.c | 103 -
 1 file changed, 38 insertions(+), 65 deletions(-)

-- 
2.23.0

[PATCH 1/2] staging: rtl8712: remove unnecessary return variables

2019-10-12 Thread Wambui Karuga

Remove variables that are only used to hold and return constants and
have the functions directly return the constants.

Issue found by coccinelle:
@@
local idexpression ret;
expression e;
@@

-ret =
+return
 e;
-return ret;
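As a plain-C illustration of the transformation, here is a before/after pair. The `DEMO_*` constants and the `stop_test_*` functions are simplified, hypothetical stand-ins for the RNDIS status codes and handlers touched by the patch.

```c
#define DEMO_STATUS_SUCCESS		0u
#define DEMO_STATUS_NOT_ACCEPTED	1u

/* Before: a local variable that only carries constants to the return. */
unsigned int stop_test_before(int is_set_oid, int stop_ok)
{
	unsigned int status = DEMO_STATUS_SUCCESS;

	if (!is_set_oid)
		return DEMO_STATUS_NOT_ACCEPTED;
	if (!stop_ok)
		status = DEMO_STATUS_NOT_ACCEPTED;
	return status;
}

/* After: each path returns its constant directly. */
unsigned int stop_test_after(int is_set_oid, int stop_ok)
{
	if (!is_set_oid)
		return DEMO_STATUS_NOT_ACCEPTED;
	if (!stop_ok)
		return DEMO_STATUS_NOT_ACCEPTED;
	return DEMO_STATUS_SUCCESS;
}
```

Both versions return the same value for every input; only the intermediate variable goes away.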

Signed-off-by: Wambui Karuga 
---
 drivers/staging/rtl8712/rtl871x_mp_ioctl.c | 46 +-
 1 file changed, 19 insertions(+), 27 deletions(-)

diff --git a/drivers/staging/rtl8712/rtl871x_mp_ioctl.c b/drivers/staging/rtl8712/rtl871x_mp_ioctl.c
index aa8f8500cbb2..8af7892809ca 100644
--- a/drivers/staging/rtl8712/rtl871x_mp_ioctl.c
+++ b/drivers/staging/rtl8712/rtl871x_mp_ioctl.c
@@ -283,13 +283,12 @@ uint oid_rt_pro_stop_test_hdl(struct oid_par_priv 
*poid_par_priv)
 {
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
-   uint status = RNDIS_STATUS_SUCCESS;
 
if (poid_par_priv->type_of_oid != SET_OID)
return RNDIS_STATUS_NOT_ACCEPTED;
if (mp_stop_test(Adapter) == _FAIL)
-   status = RNDIS_STATUS_NOT_ACCEPTED;
-   return status;
+   return RNDIS_STATUS_NOT_ACCEPTED;
+   return RNDIS_STATUS_SUCCESS;
 }
 
 uint oid_rt_pro_set_channel_direct_call_hdl(struct oid_par_priv
@@ -350,64 +349,58 @@ uint oid_rt_pro_set_tx_power_control_hdl(
 uint oid_rt_pro_query_tx_packet_sent_hdl(
struct oid_par_priv *poid_par_priv)
 {
-   uint status = RNDIS_STATUS_SUCCESS;
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
 
-   if (poid_par_priv->type_of_oid != QUERY_OID) {
-   status = RNDIS_STATUS_NOT_ACCEPTED;
-   return status;
-   }
+   if (poid_par_priv->type_of_oid != QUERY_OID)
+   return RNDIS_STATUS_NOT_ACCEPTED;
+
if (poid_par_priv->information_buf_len == sizeof(u32)) {
*(u32 *)poid_par_priv->information_buf =
Adapter->mppriv.tx_pktcount;
*poid_par_priv->bytes_rw = poid_par_priv->information_buf_len;
} else {
-   status = RNDIS_STATUS_INVALID_LENGTH;
+   return RNDIS_STATUS_INVALID_LENGTH;
}
-   return status;
+   return RNDIS_STATUS_SUCCESS;
 }
 
 uint oid_rt_pro_query_rx_packet_received_hdl(
struct oid_par_priv *poid_par_priv)
 {
-   uint status = RNDIS_STATUS_SUCCESS;
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
 
-   if (poid_par_priv->type_of_oid != QUERY_OID) {
-   status = RNDIS_STATUS_NOT_ACCEPTED;
-   return status;
-   }
+   if (poid_par_priv->type_of_oid != QUERY_OID)
+   return RNDIS_STATUS_NOT_ACCEPTED;
+
if (poid_par_priv->information_buf_len == sizeof(u32)) {
*(u32 *)poid_par_priv->information_buf =
Adapter->mppriv.rx_pktcount;
*poid_par_priv->bytes_rw = poid_par_priv->information_buf_len;
} else {
-   status = RNDIS_STATUS_INVALID_LENGTH;
+   return RNDIS_STATUS_INVALID_LENGTH;
}
-   return status;
+   return RNDIS_STATUS_SUCCESS;
 }
 
 uint oid_rt_pro_query_rx_packet_crc32_error_hdl(
struct oid_par_priv *poid_par_priv)
 {
-   uint status = RNDIS_STATUS_SUCCESS;
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
 
-   if (poid_par_priv->type_of_oid != QUERY_OID) {
-   status = RNDIS_STATUS_NOT_ACCEPTED;
-   return status;
-   }
+   if (poid_par_priv->type_of_oid != QUERY_OID)
+   return RNDIS_STATUS_NOT_ACCEPTED;
+
if (poid_par_priv->information_buf_len == sizeof(u32)) {
*(u32 *)poid_par_priv->information_buf =
Adapter->mppriv.rx_crcerrpktcount;
*poid_par_priv->bytes_rw = poid_par_priv->information_buf_len;
} else {
-   status = RNDIS_STATUS_INVALID_LENGTH;
+   return RNDIS_STATUS_INVALID_LENGTH;
}
-   return status;
+   return RNDIS_STATUS_SUCCESS;
 }
 
 uint oid_rt_pro_reset_tx_packet_sent_hdl(struct oid_par_priv
@@ -425,7 +418,6 @@ uint oid_rt_pro_reset_tx_packet_sent_hdl(struct oid_par_priv
 uint oid_rt_pro_reset_rx_packet_received_hdl(struct oid_par_priv
*poid_par_priv)
 {
-   uint status = RNDIS_STATUS_SUCCESS;
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
 
@@ -435,9 +427,9 @@ uint oid_rt_pro_reset_rx_packet_received_hdl(struct 
oid_par_priv
Adapter->mppriv.rx_pktcount = 0;

[PATCH 2/2] staging: rtl8712: clean up function headers

2019-10-12 Thread Wambui Karuga

Remove unnecessary line breaks in function headers to improve their
readability.

Signed-off-by: Wambui Karuga 
---
 drivers/staging/rtl8712/rtl871x_mp_ioctl.c | 57 --
 1 file changed, 19 insertions(+), 38 deletions(-)

diff --git a/drivers/staging/rtl8712/rtl871x_mp_ioctl.c 
b/drivers/staging/rtl8712/rtl871x_mp_ioctl.c
index 8af7892809ca..29b85330815f 100644
--- a/drivers/staging/rtl8712/rtl871x_mp_ioctl.c
+++ b/drivers/staging/rtl8712/rtl871x_mp_ioctl.c
@@ -231,8 +231,7 @@ static int mp_stop_test(struct _adapter *padapter)
return _SUCCESS;
 }
 
-uint oid_rt_pro_set_data_rate_hdl(struct oid_par_priv
-*poid_par_priv)
+uint oid_rt_pro_set_data_rate_hdl(struct oid_par_priv *poid_par_priv)
 {
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
@@ -291,8 +290,7 @@ uint oid_rt_pro_stop_test_hdl(struct oid_par_priv 
*poid_par_priv)
return RNDIS_STATUS_SUCCESS;
 }
 
-uint oid_rt_pro_set_channel_direct_call_hdl(struct oid_par_priv
-  *poid_par_priv)
+uint oid_rt_pro_set_channel_direct_call_hdl(struct oid_par_priv *poid_par_priv)
 {
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
@@ -327,8 +325,7 @@ uint oid_rt_pro_set_antenna_bb_hdl(struct oid_par_priv 
*poid_par_priv)
return RNDIS_STATUS_SUCCESS;
 }
 
-uint oid_rt_pro_set_tx_power_control_hdl(
-   struct oid_par_priv *poid_par_priv)
+uint oid_rt_pro_set_tx_power_control_hdl(struct oid_par_priv *poid_par_priv)
 {
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
@@ -346,8 +343,7 @@ uint oid_rt_pro_set_tx_power_control_hdl(
return RNDIS_STATUS_SUCCESS;
 }
 
-uint oid_rt_pro_query_tx_packet_sent_hdl(
-   struct oid_par_priv *poid_par_priv)
+uint oid_rt_pro_query_tx_packet_sent_hdl(struct oid_par_priv *poid_par_priv)
 {
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
@@ -365,8 +361,7 @@ uint oid_rt_pro_query_tx_packet_sent_hdl(
return RNDIS_STATUS_SUCCESS;
 }
 
-uint oid_rt_pro_query_rx_packet_received_hdl(
-   struct oid_par_priv *poid_par_priv)
+uint oid_rt_pro_query_rx_packet_received_hdl(struct oid_par_priv 
*poid_par_priv)
 {
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
@@ -384,8 +379,7 @@ uint oid_rt_pro_query_rx_packet_received_hdl(
return RNDIS_STATUS_SUCCESS;
 }
 
-uint oid_rt_pro_query_rx_packet_crc32_error_hdl(
-   struct oid_par_priv *poid_par_priv)
+uint oid_rt_pro_query_rx_packet_crc32_error_hdl(struct oid_par_priv 
*poid_par_priv)
 {
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
@@ -403,8 +397,7 @@ uint oid_rt_pro_query_rx_packet_crc32_error_hdl(
return RNDIS_STATUS_SUCCESS;
 }
 
-uint oid_rt_pro_reset_tx_packet_sent_hdl(struct oid_par_priv
-   *poid_par_priv)
+uint oid_rt_pro_reset_tx_packet_sent_hdl(struct oid_par_priv *poid_par_priv)
 {
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
@@ -415,8 +408,7 @@ uint oid_rt_pro_reset_tx_packet_sent_hdl(struct oid_par_priv
return RNDIS_STATUS_SUCCESS;
 }
 
-uint oid_rt_pro_reset_rx_packet_received_hdl(struct oid_par_priv
-   *poid_par_priv)
+uint oid_rt_pro_reset_rx_packet_received_hdl(struct oid_par_priv 
*poid_par_priv)
 {
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
@@ -432,8 +424,7 @@ uint oid_rt_pro_reset_rx_packet_received_hdl(struct 
oid_par_priv
return RNDIS_STATUS_SUCCESS;
 }
 
-uint oid_rt_reset_phy_rx_packet_count_hdl(struct oid_par_priv
-*poid_par_priv)
+uint oid_rt_reset_phy_rx_packet_count_hdl(struct oid_par_priv *poid_par_priv)
 {
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
@@ -444,8 +435,7 @@ uint oid_rt_reset_phy_rx_packet_count_hdl(struct 
oid_par_priv
return RNDIS_STATUS_SUCCESS;
 }
 
-uint oid_rt_get_phy_rx_packet_received_hdl(struct oid_par_priv
- *poid_par_priv)
+uint oid_rt_get_phy_rx_packet_received_hdl(struct oid_par_priv *poid_par_priv)
 {
struct _adapter *Adapter = (struct _adapter *)
   (poid_par_priv->adapter_context);
@@ -460,8 +450,7 @@

[PATCH] drivers: firmware: psci: use kernel restart handler functionality

2019-10-12 Thread Stefan Agner

From: Stefan Agner 

Use the kernel's restart handler to register the PSCI system reset
capability. The restart handler uses notifier chains along with
priorities. This allows restart handlers with a higher priority
(where available) to be used while still supporting PSCI.

Since the ARM handler had priority over the kernel's restart handler
before this patch, use a slightly elevated priority of 160 to make
sure PSCI is used before most of the other handlers are called.

Signed-off-by: Stefan Agner 
---
 drivers/firmware/psci/psci.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
index 84f4ff351c62..d8677b54132f 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -82,6 +82,7 @@ static u32 psci_function_id[PSCI_FN_MAX];
 
 static u32 psci_cpu_suspend_feature;
 static bool psci_system_reset2_supported;
+static struct notifier_block psci_restart_handler;
 
 static inline bool psci_has_ext_power_state(void)
 {
@@ -250,7 +251,8 @@ static int get_set_conduit_method(struct device_node *np)
return 0;
 }
 
-static void psci_sys_reset(enum reboot_mode reboot_mode, const char *cmd)
+static int psci_sys_reset(struct notifier_block *this,
+   unsigned long reboot_mode, void *cmd)
 {
if ((reboot_mode == REBOOT_WARM || reboot_mode == REBOOT_SOFT) &&
psci_system_reset2_supported) {
@@ -263,6 +265,8 @@ static void psci_sys_reset(enum reboot_mode reboot_mode, 
const char *cmd)
} else {
invoke_psci_fn(PSCI_0_2_FN_SYSTEM_RESET, 0, 0, 0);
}
+
+   return NOTIFY_DONE;
 }
 
 static void psci_sys_poweroff(void)
@@ -411,6 +415,8 @@ static void __init psci_init_smccc(void)
 
 static void __init psci_0_2_set_functions(void)
 {
+   int ret;
+
pr_info("Using standard PSCI v0.2 function IDs\n");
psci_ops.get_version = psci_get_version;
 
@@ -431,7 +437,14 @@ static void __init psci_0_2_set_functions(void)
 
psci_ops.migrate_info_type = psci_migrate_info_type;
 
-   arm_pm_restart = psci_sys_reset;
+   psci_restart_handler.notifier_call = psci_sys_reset;
+   psci_restart_handler.priority = 160;
+
+   ret = register_restart_handler(&psci_restart_handler);
+   if (ret) {
+   pr_err("Cannot register restart handler, %d\n", ret);
+   return;
+   }
 
pm_power_off = psci_sys_poweroff;
 }
-- 
2.23.0
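
The priority ordering that the commit message relies on can be sketched in user space as follows. This is a toy model, not the kernel's notifier implementation: every `toy_*` name is invented for the example, and the value 128 for a "default" handler is an assumption used only to have something lower than 160.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NOTIFY_DONE 0 /* handler did nothing final; keep walking */
#define TOY_NOTIFY_STOP 1 /* handler asks the walk to stop */

struct toy_notifier {
	int (*call)(struct toy_notifier *self, unsigned long mode, void *cmd);
	int priority;
	struct toy_notifier *next;
};

static struct toy_notifier *toy_chain;
static int toy_order[8]; /* priorities in the order they were called */
static int toy_count;

/* Insert so that higher priorities come first in the list. */
static void toy_register(struct toy_notifier *nb)
{
	struct toy_notifier **pos = &toy_chain;

	while (*pos && (*pos)->priority >= nb->priority)
		pos = &(*pos)->next;
	nb->next = *pos;
	*pos = nb;
}

/* Walk the chain from highest to lowest priority; return calls made. */
static int toy_call_chain(unsigned long mode, void *cmd)
{
	int calls = 0;

	for (struct toy_notifier *nb = toy_chain; nb; nb = nb->next) {
		calls++;
		if (nb->call(nb, mode, cmd) == TOY_NOTIFY_STOP)
			break;
	}
	return calls;
}

/* Example handler: records its own priority and lets the walk continue. */
static int toy_record(struct toy_notifier *self, unsigned long mode, void *cmd)
{
	(void)mode;
	(void)cmd;
	if (toy_count < 8)
		toy_order[toy_count++] = self->priority;
	return TOY_NOTIFY_DONE;
}
```

Registering one handler at 128 and one at 160 and then walking the chain calls the 160 handler first, regardless of registration order. A real reset handler would normally not return at all; the model returns only so that the ordering can be observed.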

Re: [GIT PULL] MIPS fixes

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 19:04:14 +:

> git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git 
> tags/mips_fixes_5.4_2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/63f9bff56beb718ac0a2eb8398a98220b1e119dc

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] xen: fixes for 5.4-rc3

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 12:51:31 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git 
> for-linus-5.4-rc3-tag

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/680b5b3c5d34b22695357e17b6bdd0abd83e6b1c

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] s390 updates for 5.4-rc3

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 12:25:39 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git tags/s390-5.4-4

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/f154988a905e5cad9d1a20d4c4aeb176968fe3be

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] RISC-V updates for v5.4-rc3

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 13:10:52 -0700 (PDT):

> git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git 
> tags/riscv/for-v5.4-rc3

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/48acba989ed5d8707500193048d6c4c5945d5f43

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.4-3 tag

2019-10-12 Thread pr-tracker-bot

The pull request you sent on Sat, 12 Oct 2019 22:37:15 +1100:

> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
> tags/powerpc-5.4-3

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/db60a5a035aa8692dc7cee293356bdcc078fa7b7

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

Re: clk: rockchip: Checking a kmemdup() call in rockchip_clk_register_pll()

2019-10-12 Thread Heiko Stübner

Hi Markus,

On Saturday, 12 October 2019, 15:55:44 CEST, Markus Elfring wrote:
> I tried another script for the semantic patch language out.
> This source code analysis approach points out that the implementation
> of the function “rockchip_clk_register_pll” contains also a call
> of the function “kmemdup”.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/clk/rockchip/clk-pll.c?id=1c0cc5f1ae5ee5a6913704c0d75a6e99604ee30a#n913
> https://elixir.bootlin.com/linux/v5.4-rc2/source/drivers/clk/rockchip/clk-pll.c#L913
> 
> * Do you find the usage of the format string “%s: could not allocate
>   rate table for %s\n” still appropriate at this place?

If there is an internal "no-memory" output from inside kmemdup now,
I guess the one in the clock driver would be a duplicate and could go away.

> * Is there a need to adjust the error handling here?

There is no need for additional error handling. If the rate table
could not be duplicated, the clock will still report the correct clock
rate; you just cannot set a new rate.

And for a system it's always better to have the clock driver present
than for all device drivers to fail probing, especially as this starts as
a core clock driver, so no deferring is possible.

Heiko

[PATCH] mm: memblock: do not enforce current limit for memblock_phys* family

2019-10-12 Thread Mike Rapoport

From: Mike Rapoport 

Until commit 92d12f9544b7 ("memblock: refactor internal allocation
functions") the maximal address for memblock allocations was forced to
memblock.current_limit only for the allocation functions returning virtual
address. The changes introduced by that commit moved the limit enforcement
into the allocation core and as a result the allocation functions returning
physical address also started to limit allocations to
memblock.current_limit.

This caused breakage of etnaviv GPU driver:

[3.682347] etnaviv etnaviv: bound 13.gpu (ops gpu_ops)
[3.688669] etnaviv etnaviv: bound 134000.gpu (ops gpu_ops)
[3.695099] etnaviv etnaviv: bound 2204000.gpu (ops gpu_ops)
[3.700800] etnaviv-gpu 13.gpu: model: GC2000, revision: 5108
[3.723013] etnaviv-gpu 13.gpu: command buffer outside valid
memory window
[3.731308] etnaviv-gpu 134000.gpu: model: GC320, revision: 5007
[3.752437] etnaviv-gpu 134000.gpu: command buffer outside valid
memory window
[3.760583] etnaviv-gpu 2204000.gpu: model: GC355, revision: 1215
[3.766766] etnaviv-gpu 2204000.gpu: Ignoring GPU with VG and FE2.0

Restore the behaviour of memblock_phys* family so that these functions will
not enforce memblock.current_limit.

Fixes: 92d12f9544b7 ("memblock: refactor internal allocation functions")
Reported-by: Adam Ford 
Tested-by: Adam Ford  #imx6q-logicpd
Signed-off-by: Mike Rapoport 
---
 mm/memblock.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 7d4f61a..c4b16ca 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1356,9 +1356,6 @@ static phys_addr_t __init 
memblock_alloc_range_nid(phys_addr_t size,
align = SMP_CACHE_BYTES;
}
 
-   if (end > memblock.current_limit)
-   end = memblock.current_limit;
-
 again:
found = memblock_find_in_range_node(size, align, start, end, nid,
flags);
@@ -1469,6 +1466,9 @@ static void * __init memblock_alloc_internal(
if (WARN_ON_ONCE(slab_is_available()))
return kzalloc_node(size, GFP_NOWAIT, nid);
 
+   if (max_addr > memblock.current_limit)
+   max_addr = memblock.current_limit;
+
alloc = memblock_alloc_range_nid(size, align, min_addr, max_addr, nid);
 
/* retry allocation without lower limit */
-- 
2.7.4
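
The effect of moving the clamp can be sketched with a toy top-down allocator. Everything here is an assumption made for illustration: the `toy_*` names, the limit value, and the simplistic "highest aligned address below the bound" policy are not memblock's actual implementation.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_phys_t;

/* Hypothetical stand-in for memblock.current_limit. */
static toy_phys_t toy_current_limit = 0x30000000;

/*
 * Core allocator model: return the highest aligned address in
 * [start, end) that fits `size`, or 0 on failure. After the patch
 * the core no longer clamps `end` itself.
 */
static toy_phys_t toy_alloc_range(toy_phys_t size, toy_phys_t align,
				  toy_phys_t start, toy_phys_t end)
{
	toy_phys_t addr;

	if (size == 0 || align == 0 || end < start || end - start < size)
		return 0;
	addr = (end - size) / align * align;
	return addr >= start ? addr : 0;
}

/* Physical-address flavour: the caller's bound is passed through. */
static toy_phys_t toy_phys_alloc_range(toy_phys_t size, toy_phys_t align,
				       toy_phys_t start, toy_phys_t end)
{
	return toy_alloc_range(size, align, start, end);
}

/* Virtual-address flavour: clamp to the current limit before allocating. */
static toy_phys_t toy_alloc_mapped(toy_phys_t size, toy_phys_t align,
				   toy_phys_t min_addr, toy_phys_t max_addr)
{
	if (max_addr > toy_current_limit)
		max_addr = toy_current_limit;
	return toy_alloc_range(size, align, min_addr, max_addr);
}
```

With a bound of 0x40000000 the physical flavour may hand out memory above the limit, while the mapped flavour stays below it, which is the split the patch restores.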

Re: [PATCH v5 bpf-next 09/15] samples/bpf: use own flags but not HOSTCFLAGS

2019-10-12 Thread Ivan Khoronzhuk


On Fri, Oct 11, 2019 at 02:16:05PM +0300, Sergei Shtylyov wrote:
> On 10/11/2019 12:57 PM, Ivan Khoronzhuk wrote:
>
>>>> While compiling natively, the host's cflags and ldflags are equal to
>>>> ones used from HOSTCFLAGS and HOSTLDFLAGS. When cross compiling it
>>>> should have own, used for target arch. While verification, for arm,
>>>
>>>   While verifying.
>>
>> While verification stage.
>
>   While *in* verification stage, "while" doesn't combine with nouns w/o
> a preposition.

Sergei, please add me to the Cc list when a message is addressed to me;
otherwise I can miss it.

Regarding the language lesson, thanks, I will keep it in mind next
time. But the issue is not serious, if it is an issue at all, so I would
rather leave it as is: there is no reason to correct it without code
changes, and everyone is able to understand it.

>>>> arm64 and x86_64 the following flags were used always:
>>>>
>>>> -Wall -O2
>>>> -fomit-frame-pointer
>>>> -Wmissing-prototypes
>>>> -Wstrict-prototypes
>>>>
>>>> So, add them as they were verified and used before adding
>>>> Makefile.target and lets omit "-fomit-frame-pointer" as were proposed
>>>> while review, as no sense in such optimization for samples.
>>>>
>>>> Signed-off-by: Ivan Khoronzhuk
>
> [...]
>
> MBR, Sergei


--
Regards,
Ivan Khoronzhuk

Re: [PATCH v3 2/2] iio: (bma400) add driver for the BMA400

2019-10-12 Thread Dan Robertson

>
> No comment other than thank you for ignoring my previous comments.  :(
>

Oh, no! Sorry, it looks like I missed you in the To and CC list in my reply :/

Cheers,

 - Dan



Re: Linux 5.3.6

2019-10-12 Thread Gabriel C

On Sat, 12 Oct 2019 at 21:16, Chris Clayton wrote:
>
>
> > I'm announcing the release of the 5.3.6 kernel.
>
>
> 5.3.6 build fails here with:
>
> arch/x86/entry/vdso/vdso64.so.dbg: undefined symbols found
>   CC  arch/x86/kernel/cpu/mce/threshold.o
> make[3]: *** [arch/x86/entry/vdso/Makefile:59: 
> arch/x86/entry/vdso/vdso64.so.dbg] Error 1
> make[3]: *** Deleting file 'arch/x86/entry/vdso/vdso64.so.dbg'
> make[2]: *** [scripts/Makefile.build:497: arch/x86/entry/vdso] Error 2
> make[1]: *** [scripts/Makefile.build:497: arch/x86/entry] Error 2
> make[1]: *** Waiting for unfinished jobs
>

What is your default linker?

Also, does make LD=ld.bfd fix that for you?

See https://bugzilla.kernel.org/show_bug.cgi?id=204951

BR,

Gabriel C.

Re: [RFC PATCH net] net: phy: Fix "link partner" information disappear issue

2019-10-12 Thread Heiner Kallweit

On 11.10.2019 07:55, Yonglong Liu wrote:
> 
> 
> On 2019/10/11 3:17, Heiner Kallweit wrote:
>> On 10.10.2019 11:30, Yonglong Liu wrote:
>>> Some drivers just call phy_ethtool_ksettings_set() to set the
>>> links, for those phy drivers that use genphy_read_status(), if
>>> autoneg is on, and the link is up, than execute "ethtool -s
>>> ethx autoneg on" will cause "link partner" information disappear.
>>>
>>> The call trace is phy_ethtool_ksettings_set()->phy_start_aneg()
>>> ->linkmode_zero(phydev->lp_advertising)->genphy_read_status(),
>>> the link didn't change, so genphy_read_status() just return, and
>>> phydev->lp_advertising is zero now.
>>>
>> I think that clearing link partner advertising info in
>> phy_start_aneg() is questionable. If advertising doesn't change
>> then phy_config_aneg() basically is a no-op. Instead we may have
>> to clear the link partner advertising info in genphy_read_lpa()
>> if aneg is disabled or aneg isn't completed (basically the same
>> as in genphy_c45_read_lpa()). Something like:
>>
>> if (!phydev->autoneg_complete) { /* also covers case that aneg is disabled */
>>  linkmode_zero(phydev->lp_advertising);
>> } else if (phydev->autoneg == AUTONEG_ENABLE) {
>>  ...
>> }
>>
> 
> If clear the link partner advertising info in genphy_read_lpa() and
> genphy_c45_read_lpa(), for the drivers that use genphy_read_status()
> is ok, but for those drivers that use there own read_status() may
> have problem, like aqr_read_status(), it will update lp_advertising
> first, and than call genphy_c45_read_status(), so will cause
> lp_advertising lost.
> 
Right, in genphy_read_lpa() we shouldn't clear all lpa bits but only
those ones the generic functions care about. Basically the same as
in the c45 version. Then a vendor-specific part isn't affected.

aqr_read_status() is a good example. It deals with 1Gbps mode that
isn't covered by the generic c45 functions. Therefore the 1Gbps-related
bits won't be overwritten by the generic functions.
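
The "clear only what the generic code owns" idea can be sketched with a plain bitmask. This is a simplified model under stated assumptions: the real code works on linkmode bitmaps in struct phy_device, and the `TOY_*` bit names and `toy_generic_read_lpa()` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical link-partner bits; two are "generic", one is vendor-only. */
#define TOY_LP_GENERIC_A (1u << 0)
#define TOY_LP_GENERIC_B (1u << 1)
#define TOY_LP_VENDOR    (1u << 2)

#define TOY_GENERIC_MASK (TOY_LP_GENERIC_A | TOY_LP_GENERIC_B)

/*
 * Model of a generic LPA reader: it always drops the bits it is
 * responsible for, refills them only when autonegotiation has
 * completed, and never touches bits outside its mask.
 */
static uint32_t toy_generic_read_lpa(uint32_t lp_adv, int aneg_complete,
				     uint32_t hw_bits)
{
	lp_adv &= ~TOY_GENERIC_MASK;
	if (aneg_complete)
		lp_adv |= hw_bits & TOY_GENERIC_MASK;
	return lp_adv;
}
```

A vendor read_status() that sets `TOY_LP_VENDOR` before calling the generic helper keeps that bit whether or not autonegotiation has completed, while stale generic bits are dropped.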

> Another question, please see genphy_c45_read_status(), if clear the
> link partner advertising info in genphy_c45_read_lpa(), if autoneg is
> off, phydev->lp_advertising will not clear.
> 

If autoneg is off, lp_advertising should never be set, so there's
nothing to clear. However we may have to look at the case that user
switches to fixed speed mode via ethtool.

>>> This patch call genphy_read_lpa() before the link state judgement
>>> to fix this problem.
>>>
>>> Fixes: 88d6272acaaa ("net: phy: avoid unneeded MDIO reads in 
>>> genphy_read_status")
>>> Signed-off-by: Yonglong Liu 
>>> ---
>>>  drivers/net/phy/phy_device.c | 8 
>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>> index 9d2bbb1..ef3073c 100644
>>> --- a/drivers/net/phy/phy_device.c
>>> +++ b/drivers/net/phy/phy_device.c
>>> @@ -1839,6 +1839,10 @@ int genphy_read_status(struct phy_device *phydev)
>>> if (err)
>>> return err;
>>>  
>>> +   err = genphy_read_lpa(phydev);
>>> +   if (err < 0)
>>> +   return err;
>>> +
>>> /* why bother the PHY if nothing can have changed */
>>> if (phydev->autoneg == AUTONEG_ENABLE && old_link && phydev->link)
>>> return 0;
>>> @@ -1848,10 +1852,6 @@ int genphy_read_status(struct phy_device *phydev)
>>> phydev->pause = 0;
>>> phydev->asym_pause = 0;
>>>  
>>> -   err = genphy_read_lpa(phydev);
>>> -   if (err < 0)
>>> -   return err;
>>> -
>>> if (phydev->autoneg == AUTONEG_ENABLE && phydev->autoneg_complete) {
>>> phy_resolve_aneg_linkmode(phydev);
>>> } else if (phydev->autoneg == AUTONEG_DISABLE) {
>>>

Re: [PATCH RFC v1 0/2] vhost: ring format independence

2019-10-12 Thread Michael S. Tsirkin

On Sat, Oct 12, 2019 at 03:31:50PM +0800, Jason Wang wrote:
> 
> On 2019/10/11 下午9:45, Michael S. Tsirkin wrote:
> > So the idea is as follows: we convert descriptors to an
> > independent format first, and process that converting to
> > iov later.
> > 
> > The point is that we have a tight loop that fetches
> > descriptors, which is good for cache utilization.
> > This will also allow all kind of batching tricks -
> > e.g. it seems possible to keep SMAP disabled while
> > we are fetching multiple descriptors.
> > 
> > And perhaps more importantly, this is a very good fit for the packed
> > ring layout, where we get and put descriptors in order.
> > 
> > This patchset seems to already perform exactly the same as the original
> > code already based on a microbenchmark.  More testing would be very much
> > appreciated.
> > 
> > Biggest TODO before this first step is ready to go in is to
> > batch indirect descriptors as well.
> > 
> > Integrating into vhost-net is basically
> > s/vhost_get_vq_desc/vhost_get_vq_desc_batch/ -
> > or add a module parameter like I did in the test module.
> 
> 
> It would be better to convert vhost_net then I can do some benchmark on
> that.
> 
> Thanks

Sure, I'll post a small patch that does this.

> 
> > 
> > 
> > 
> > Michael S. Tsirkin (2):
> >vhost: option to fetch descriptors through an independent struct
> >vhost: batching fetches
> > 
> >   drivers/vhost/test.c  |  19 ++-
> >   drivers/vhost/vhost.c | 333 +-
> >   drivers/vhost/vhost.h |  20 ++-
> >   3 files changed, 365 insertions(+), 7 deletions(-)
> >

Re: [PATCH RFC v1 2/2] vhost: batching fetches

2019-10-12 Thread Michael S. Tsirkin

On Sat, Oct 12, 2019 at 03:30:52PM +0800, Jason Wang wrote:
> 
> On 2019/10/11 下午9:46, Michael S. Tsirkin wrote:
> > With this patch applied, new and old code perform identically.
> > 
> > Lots of extra optimizations are now possible, e.g.
> > we can fetch multiple heads with copy_from/to_user now.
> > We can get rid of maintaining the log array.  Etc etc.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >   drivers/vhost/test.c  |  2 +-
> >   drivers/vhost/vhost.c | 50 ---
> >   drivers/vhost/vhost.h |  4 +++-
> >   3 files changed, 46 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
> > index 39a018a7af2d..e3a8e9db22cd 100644
> > --- a/drivers/vhost/test.c
> > +++ b/drivers/vhost/test.c
> > @@ -128,7 +128,7 @@ static int vhost_test_open(struct inode *inode, struct 
> > file *f)
> > > dev = &n->dev;
> > > vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
> > n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
> > -   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV,
> > +   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV + 64,
> >VHOST_TEST_PKT_WEIGHT, VHOST_TEST_WEIGHT);
> > f->private_data = n;
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index 36661d6cb51f..aa383e847865 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -302,6 +302,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
> >   {
> > vq->num = 1;
> > vq->ndescs = 0;
> > +   vq->first_desc = 0;
> > vq->desc = NULL;
> > vq->avail = NULL;
> > vq->used = NULL;
> > @@ -390,6 +391,7 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev 
> > *dev)
> > for (i = 0; i < dev->nvqs; ++i) {
> > vq = dev->vqs[i];
> > vq->max_descs = dev->iov_limit;
> > +   vq->batch_descs = dev->iov_limit - UIO_MAXIOV;
> > vq->descs = kmalloc_array(vq->max_descs,
> >   sizeof(*vq->descs),
> >   GFP_KERNEL);
> > @@ -2366,6 +2368,8 @@ static void pop_split_desc(struct vhost_virtqueue *vq)
> > --vq->ndescs;
> >   }
> > +#define VHOST_DESC_FLAGS (VRING_DESC_F_INDIRECT | VRING_DESC_F_WRITE | \
> > + VRING_DESC_F_NEXT)
> >   static int push_split_desc(struct vhost_virtqueue *vq, struct vring_desc 
> > *desc, u16 id)
> >   {
> > struct vhost_desc *h;
> > @@ -2375,7 +2379,7 @@ static int push_split_desc(struct vhost_virtqueue 
> > *vq, struct vring_desc *desc,
> > > h = &vq->descs[vq->ndescs++];
> > h->addr = vhost64_to_cpu(vq, desc->addr);
> > h->len = vhost32_to_cpu(vq, desc->len);
> > -   h->flags = vhost16_to_cpu(vq, desc->flags);
> > +   h->flags = vhost16_to_cpu(vq, desc->flags) & VHOST_DESC_FLAGS;
> > h->id = id;
> > return 0;
> > @@ -2450,7 +2454,7 @@ static int fetch_indirect_descs(struct 
> > vhost_virtqueue *vq,
> > return 0;
> >   }
> > -static int fetch_descs(struct vhost_virtqueue *vq)
> > +static int fetch_buf(struct vhost_virtqueue *vq)
> >   {
> > struct vring_desc desc;
> > unsigned int i, head, found = 0;
> > @@ -2462,7 +2466,11 @@ static int fetch_descs(struct vhost_virtqueue *vq)
> > /* Check it isn't doing very strange things with descriptor numbers. */
> > last_avail_idx = vq->last_avail_idx;
> > -   if (vq->avail_idx == vq->last_avail_idx) {
> > +   if (unlikely(vq->avail_idx == vq->last_avail_idx)) {
> > +   /* If we already have work to do, don't bother re-checking. */
> > +   if (likely(vq->ndescs))
> > +   return vq->num;
> > +
> > > if (unlikely(vhost_get_avail_idx(vq, &avail_idx))) {
> > > vq_err(vq, "Failed to access avail idx at %p\n",
> > > &vq->avail->idx);
> > @@ -2541,6 +2549,24 @@ static int fetch_descs(struct vhost_virtqueue *vq)
> > return 0;
> >   }
> > +static int fetch_descs(struct vhost_virtqueue *vq)
> > +{
> > +   int ret = 0;
> > +
> > +   if (unlikely(vq->first_desc >= vq->ndescs)) {
> > +   vq->first_desc = 0;
> > +   vq->ndescs = 0;
> > +   }
> > +
> > +   if (vq->ndescs)
> > +   return 0;
> > +
> > +   while (!ret && vq->ndescs <= vq->batch_descs)
> > +   ret = fetch_buf(vq);
> 
> 
> It looks to me descriptor chaining might be broken here.

It should work because fetch_buf fetches a whole buf, following
the chain. Seems to work in a small test ... what issues do you see?

> 
> > +
> > +   return vq->ndescs ? 0 : ret;
> > +}
> > +
> >   /* This looks in the virtqueue and for the first available buffer, and 
> > converts
> >* it to an iovec for convenient access.  Since descriptors consist of 
> > some
> >* number of output then some number of input descriptors, it's actually 
> > two
> > @@ -2562,6 +2588,8 @@ int vhost_get_vq_desc_batch(struct vhost_virtqueue 
> > *vq,
> > if (ret)
> > return ret;
> > +   /* Note: indirect

Re: [PATCH v3 2/2] iio: (bma400) add driver for the BMA400

2019-10-12 Thread Randy Dunlap

On 10/12/19 12:25 PM, Dan Robertson wrote:
> Add a IIO driver for the Bosch BMA400 3-axes ultra-low power accelerometer.
> The driver supports reading from the acceleration and temperature
> registers. The driver also supports reading and configuring the output data
> rate, oversampling ratio, and scale.
> 
> Signed-off-by: Dan Robertson 
> ---
>  drivers/iio/accel/Kconfig   |  18 +
>  drivers/iio/accel/Makefile  |   2 +
>  drivers/iio/accel/bma400.h  |  80 
>  drivers/iio/accel/bma400_core.c | 788 
>  drivers/iio/accel/bma400_i2c.c  |  60 +++
>  5 files changed, 948 insertions(+)
>  create mode 100644 drivers/iio/accel/bma400.h
>  create mode 100644 drivers/iio/accel/bma400_core.c
>  create mode 100644 drivers/iio/accel/bma400_i2c.c

No comment other than thank you for ignoring my previous comments.  :(

-- 
~Randy

[GIT PULL] mtd: Fixes for v5.4-rc3

2019-10-12 Thread Richard Weinberger

Linus,

The following changes since commit 54ecb8f7028c5eb3d740bb82b0f1d90f2df63c5c:

  Linux 5.4-rc1 (2019-09-30 10:35:40 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git 
tags/fixes-for-5.4-rc3

for you to fetch changes up to df8fed831cbcdce7b283b2d9c1aadadcf8940d05:

  mtd: rawnand: au1550nd: Fix au_read_buf16() prototype (2019-10-07 09:56:36 
+0200)


This pull request contains two fixes for MTD:

- spi-nor: Fix for a regression in write_sr()
- rawnand: Regression fix for the au1550nd driver


Paul Burton (1):
  mtd: rawnand: au1550nd: Fix au_read_buf16() prototype

Tudor Ambarus (1):
  mtd: spi-nor: Fix direction of the write_sr() transfer

 drivers/mtd/nand/raw/au1550nd.c | 5 ++---
 drivers/mtd/spi-nor/spi-nor.c   | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

Re: [PATCH RFC v1 1/2] vhost: option to fetch descriptors through an independent struct

2019-10-12 Thread Michael S. Tsirkin

On Sat, Oct 12, 2019 at 03:28:49PM +0800, Jason Wang wrote:
> 
> On 2019/10/11 下午9:45, Michael S. Tsirkin wrote:
> > The idea is to support multiple ring formats by converting
> > to a format-independent array of descriptors.
> > 
> > This costs extra cycles, but we gain in ability
> > to fetch a batch of descriptors in one go, which
> > is good for code cache locality.
> > 
> > To simplify benchmarking, I kept the old code
> > around so one can switch back and forth by
> > writing into a module parameter.
> > This will go away in the final submission.
> > 
> > This patch causes a minor performance degradation,
> > it's been kept as simple as possible for ease of review.
> > Next patch gets us back the performance by adding batching.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >   drivers/vhost/test.c  |  17 ++-
> >   drivers/vhost/vhost.c | 299 +-
> >   drivers/vhost/vhost.h |  16 +++
> >   3 files changed, 327 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
> > index 056308008288..39a018a7af2d 100644
> > --- a/drivers/vhost/test.c
> > +++ b/drivers/vhost/test.c
> > @@ -18,6 +18,9 @@
> >   #include "test.h"
> >   #include "vhost.h"
> > +static int newcode = 0;
> > +module_param(newcode, int, 0644);
> > +
> >   /* Max number of bytes transferred before requeueing the job.
> >* Using this limit prevents one virtqueue from starving others. */
> >   #define VHOST_TEST_WEIGHT 0x8
> > @@ -58,10 +61,16 @@ static void handle_vq(struct vhost_test *n)
> > > vhost_disable_notify(&n->dev, vq);
> > for (;;) {
> > > -   head = vhost_get_vq_desc(vq, vq->iov,
> > > -ARRAY_SIZE(vq->iov),
> > > -&out, &in,
> > > -NULL, NULL);
> > > +   if (newcode)
> > > +   head = vhost_get_vq_desc_batch(vq, vq->iov,
> > > +  ARRAY_SIZE(vq->iov),
> > > +  &out, &in,
> > > +  NULL, NULL);
> > > +   else
> > > +   head = vhost_get_vq_desc(vq, vq->iov,
> > > +ARRAY_SIZE(vq->iov),
> > > +&out, &in,
> > > +NULL, NULL);
> > /* On error, stop handling until the next kick. */
> > if (unlikely(head < 0))
> > break;
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index 36ca2cf419bf..36661d6cb51f 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -301,6 +301,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
> >struct vhost_virtqueue *vq)
> >   {
> > vq->num = 1;
> > +   vq->ndescs = 0;
> > vq->desc = NULL;
> > vq->avail = NULL;
> > vq->used = NULL;
> > @@ -369,6 +370,9 @@ static int vhost_worker(void *data)
> >   static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq)
> >   {
> > +   kfree(vq->descs);
> > +   vq->descs = NULL;
> > +   vq->max_descs = 0;
> > kfree(vq->indirect);
> > vq->indirect = NULL;
> > kfree(vq->log);
> > @@ -385,6 +389,10 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev 
> > *dev)
> > for (i = 0; i < dev->nvqs; ++i) {
> > vq = dev->vqs[i];
> > +   vq->max_descs = dev->iov_limit;
> > +   vq->descs = kmalloc_array(vq->max_descs,
> > + sizeof(*vq->descs),
> > + GFP_KERNEL);
> 
> 
> Is iov_limit too much here? It can obviously increase the footprint. I guess
> the batching can only be done for descriptor without indirect or next set.
> Then we may batch 16 or 64.
> 
> Thanks

Yes, next patch only batches up to 64.  But we do need iov_limit because
guest can pass a long chain of scatter/gather.
We already have iovecs in a huge array so this does not look like
a big deal. If we ever teach the code to avoid the huge
iov arrays by handling huge s/g lists piece by piece,
we can make the desc array smaller at the same point.

[GIT PULL] RISC-V updates for v5.4-rc3

2019-10-12 Thread Paul Walmsley

Linus,

The following changes since commit da0c9ea146cbe92b832f1b0f694840ea8eb33cce:

  Linux 5.4-rc2 (2019-10-06 14:27:30 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git 
tags/riscv/for-v5.4-rc3

for you to fetch changes up to cd9e72b80090a8cd7d84a47a30a06fa92ff277d1:

  RISC-V: entry: Remove unneeded need_resched() loop (2019-10-09 16:48:27 -0700)


RISC-V updates for v5.4-rc3

Some RISC-V fixes for v5.4-rc3:

- Fix several bugs in the breakpoint trap handler

- Drop an unnecessary loop around calls to preempt_schedule_irq()


Valentin Schneider (1):
  RISC-V: entry: Remove unneeded need_resched() loop

Vincent Chen (3):
  riscv: avoid kernel hangs when trapped in BUG()
  riscv: avoid sending a SIGTRAP to a user thread trapped in WARN()
  riscv: Correct the handling of unexpected ebreak in do_trap_break()

 arch/riscv/kernel/entry.S |  3 +--
 arch/riscv/kernel/traps.c | 14 +++---
 2 files changed, 8 insertions(+), 9 deletions(-)

[PATCH] arm64: dts: sun50i: sopine-baseboard: Expose serial1, serial2 and serial3

2019-10-12 Thread Alistair Francis

Follow what the sun50i-a64-pine64.dts does and expose all 5 serial
connections.

Signed-off-by: Alistair Francis 
---
 .../allwinner/sun50i-a64-sopine-baseboard.dts | 25 +++
 1 file changed, 25 insertions(+)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts 
b/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts
index 124b0b030b28..49c37b21ab36 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts
@@ -56,6 +56,10 @@
aliases {
ethernet0 = 
serial0 = 
+   serial1 = 
+   serial2 = 
+   serial3 = 
+   serial4 = 
};
 
chosen {
@@ -280,6 +284,27 @@
};
 };
 
+/* On Pi-2 connector */
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "disabled";
+};
+
+/* On Euler connector */
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "disabled";
+};
+
+/* On Euler connector, RTS/CTS optional */
+ {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+   status = "disabled";
+};
+
 _otg {
dr_mode = "host";
status = "okay";
-- 
2.23.0

[PATCH v3 1/2] dt-bindings: iio: accel: bma400: add bindings

2019-10-12 Thread Dan Robertson

Add devicetree binding for the Bosch BMA400 3-axes ultra-low power
accelerometer sensor.

Signed-off-by: Dan Robertson 
---
 .../devicetree/bindings/iio/accel/bma400.yaml | 39 +++
 1 file changed, 39 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iio/accel/bma400.yaml

diff --git a/Documentation/devicetree/bindings/iio/accel/bma400.yaml 
b/Documentation/devicetree/bindings/iio/accel/bma400.yaml
new file mode 100644
index ..31dceac89ace
--- /dev/null
+++ b/Documentation/devicetree/bindings/iio/accel/bma400.yaml
@@ -0,0 +1,39 @@
+# SPDX-License-Identifier: GPL-2.0
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/iio/accel/bma400.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Bosch BMA400 triaxial acceleration sensor
+
+maintainers:
+  - Dan Robertson 
+
+description: |
+  Acceleration and temperature iio sensors with an i2c interface
+
+  Specifications about the sensor can be found at:
+
https://ae-bst.resource.bosch.com/media/_tech/media/datasheets/BST-BMA400-DS000.pdf
+
+properties:
+  compatible:
+enum:
+  - bosch,bma400
+
+  reg:
+maxItems: 1
+
+required:
+  - compatible
+  - reg
+
+examples:
+  - |
+i2c0 {
+  #address-cells = <1>;
+  #size-cells = <1>;
+  bma400@14 {
+compatible = "bosch,bma400";
+reg = <0x14>;
+  };
+};
-- 
2.23.0

[PATCH v3 2/2] iio: (bma400) add driver for the BMA400

2019-10-12 Thread Dan Robertson

Add an IIO driver for the Bosch BMA400 3-axes ultra-low power accelerometer.
The driver supports reading from the acceleration and temperature
registers. The driver also supports reading and configuring the output data
rate, oversampling ratio, and scale.

Signed-off-by: Dan Robertson 
---
 drivers/iio/accel/Kconfig   |  18 +
 drivers/iio/accel/Makefile  |   2 +
 drivers/iio/accel/bma400.h  |  80 
 drivers/iio/accel/bma400_core.c | 788 
 drivers/iio/accel/bma400_i2c.c  |  60 +++
 5 files changed, 948 insertions(+)
 create mode 100644 drivers/iio/accel/bma400.h
 create mode 100644 drivers/iio/accel/bma400_core.c
 create mode 100644 drivers/iio/accel/bma400_i2c.c

diff --git a/drivers/iio/accel/Kconfig b/drivers/iio/accel/Kconfig
index 9b9656ce37e6..a1081b902d16 100644
--- a/drivers/iio/accel/Kconfig
+++ b/drivers/iio/accel/Kconfig
@@ -112,6 +112,24 @@ config BMA220
  To compile this driver as a module, choose M here: the
  module will be called bma220_spi.
 
+config BMA400
+   tristate "Bosch BMA400 3-Axis Accelerometer Driver"
+   depends on I2C
+   select REGMAP
+   select BMA400_I2C if (I2C)
+   help
+ Say Y here if you want to build a driver for the Bosch BMA400
+ triaxial acceleration sensor.
+
+ To compile this driver as a module, choose M here: the
+ module will be called bma400_core and you will also get
+ bma400_i2c for I2C.
+
+config BMA400_I2C
+   tristate
+   depends on BMA400
+   select REGMAP_I2C
+
 config BMC150_ACCEL
tristate "Bosch BMC150 Accelerometer Driver"
select IIO_BUFFER
diff --git a/drivers/iio/accel/Makefile b/drivers/iio/accel/Makefile
index 56bd0215e0d4..3a051cf37f40 100644
--- a/drivers/iio/accel/Makefile
+++ b/drivers/iio/accel/Makefile
@@ -14,6 +14,8 @@ obj-$(CONFIG_ADXL372_I2C) += adxl372_i2c.o
 obj-$(CONFIG_ADXL372_SPI) += adxl372_spi.o
 obj-$(CONFIG_BMA180) += bma180.o
 obj-$(CONFIG_BMA220) += bma220_spi.o
+obj-$(CONFIG_BMA400) += bma400_core.o
+obj-$(CONFIG_BMA400_I2C) += bma400_i2c.o
 obj-$(CONFIG_BMC150_ACCEL) += bmc150-accel-core.o
 obj-$(CONFIG_BMC150_ACCEL_I2C) += bmc150-accel-i2c.o
 obj-$(CONFIG_BMC150_ACCEL_SPI) += bmc150-accel-spi.o
diff --git a/drivers/iio/accel/bma400.h b/drivers/iio/accel/bma400.h
new file mode 100644
index ..e5fa57d1b97a
--- /dev/null
+++ b/drivers/iio/accel/bma400.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * bma400.h - Register constants and other forward declarations
+ *needed by the bma400 sources.
+ *
+ * Copyright 2019 Dan Robertson 
+ */
+
+#include 
+
+/*
+ * Read-Only Registers
+ */
+
+/* Status and ID registers */
+#define BMA400_CHIP_ID_REG  0x00
+#define BMA400_ERR_REG  0x02
+#define BMA400_STATUS_REG   0x03
+
+/* Acceleration registers */
+#define BMA400_X_AXIS_LSB_REG   0x04
+#define BMA400_X_AXIS_MSB_REG   0x05
+#define BMA400_Y_AXIS_LSB_REG   0x06
+#define BMA400_Y_AXIS_MSB_REG   0x07
+#define BMA400_Z_AXIS_LSB_REG   0x08
+#define BMA400_Z_AXIS_MSB_REG   0x09
+
+/* Sensor time registers */
+#define BMA400_SENSOR_TIME0 0x0a
+#define BMA400_SENSOR_TIME1 0x0b
+#define BMA400_SENSOR_TIME2 0x0c
+
+/* Event and interrupt registers */
+#define BMA400_EVENT_REG0x0d
+#define BMA400_INT_STAT0_REG0x0e
+#define BMA400_INT_STAT1_REG0x0f
+#define BMA400_INT_STAT2_REG0x10
+
+/* Temperature register */
+#define BMA400_TEMP_DATA_REG0x11
+
+/* FIFO length and data registers */
+#define BMA400_FIFO_LENGTH0_REG 0x12
+#define BMA400_FIFO_LENGTH1_REG 0x13
+#define BMA400_FIFO_DATA_REG0x14
+
+/* Step count registers */
+#define BMA400_STEP_CNT0_REG0x15
+#define BMA400_STEP_CNT1_REG0x16
+#define BMA400_STEP_CNT3_REG0x17
+#define BMA400_STEP_STAT_REG0x18
+
+/*
+ * Read-write configuration registers
+ */
+#define BMA400_ACC_CONFIG0_REG  0x19
+#define BMA400_ACC_CONFIG1_REG  0x1a
+#define BMA400_ACC_CONFIG2_REG  0x1b
+#define BMA400_CMD_REG  0x7e
+
+/* Chip ID of BMA 400 devices found in the chip ID register. */
+#define BMA400_ID_REG_VAL   0x90
+
+#define BMA400_TWO_BITS_MASK0x03
+#define BMA400_LP_OSR_MASK  0x60
+#define BMA400_NP_OSR_MASK  0x30
+#define BMA400_ACC_ODR_MASK 0x0f
+#define BMA400_ACC_SCALE_MASK   0xc0
+
+#define BMA400_LP_OSR_SHIFT 0x05
+#define BMA400_NP_OSR_SHIFT 0x04
+#define BMA400_SCALE_SHIFT  0x06
+
+extern const struct regmap_config bma400_regmap_config;
+
+int bma400_probe(struct device *dev,
+struct regmap *regmap,
+const char *name);
+
+int bma400_remove(struct device *dev);
diff --git a/drivers/iio/accel/bma400_core.c b/drivers/iio/accel/bma400_core.c
new file mode 100644
index ..1b19e69686ad
--- /dev/null
+++ b/drivers/iio/accel/bma400_core.c
@@ -0,0 +1,788 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * bma400_core.c - Core IIO driver for Bosch BMA400 triaxial

[PATCH v3 0/2] iio: add driver for Bosch BMA400 accelerometer

2019-10-12 Thread Dan Robertson

This patchset adds an IIO driver for the Bosch BMA400 3-axes ultra low-power
accelerometer.  The initial implementation of the driver adds read support for
the acceleration and temperature data registers. The driver also has support
for reading and writing to the output data rate, oversampling ratio, and scale
configuration registers.

Version 3 implements the feedback from reviewers of the v2 patchset.

Cheers,

 - Dan

Changes in v3:

 * Use yaml format for DT bindings
 * Remove strict dependency on OF
 * Tidy Kconfig dependencies
 * Stylistic changes
 * Do not soft-reset device on remove

Changes in v2:

 * Implemented iio_info -> read_avail
 * Stylistic changes
 * Implemented devicetree bindings

Dan Robertson (2):
  dt-bindings: iio: accel: bma400: add bindings
  iio: (bma400) add driver for the BMA400

 .../devicetree/bindings/iio/accel/bma400.yaml |  39 +
 drivers/iio/accel/Kconfig |  18 +
 drivers/iio/accel/Makefile|   2 +
 drivers/iio/accel/bma400.h|  80 ++
 drivers/iio/accel/bma400_core.c   | 788 ++
 drivers/iio/accel/bma400_i2c.c|  60 ++
 6 files changed, 987 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iio/accel/bma400.yaml
 create mode 100644 drivers/iio/accel/bma400.h
 create mode 100644 drivers/iio/accel/bma400_core.c
 create mode 100644 drivers/iio/accel/bma400_i2c.c

-- 
2.23.0
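
As a rough illustration of how the mask and shift constants from bma400.h
in patch 2/2 fit together, the sketch below decodes and updates fields of a
raw configuration byte. The constants are copied from the patch; the helper
names bma400_field_get() and bma400_field_set() are hypothetical and do not
come from the driver, and no claim is made here about which register each
field lives in.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from bma400.h in patch 2/2. */
#define BMA400_LP_OSR_MASK      0x60
#define BMA400_NP_OSR_MASK      0x30
#define BMA400_ACC_ODR_MASK     0x0f
#define BMA400_ACC_SCALE_MASK   0xc0

#define BMA400_LP_OSR_SHIFT     0x05
#define BMA400_NP_OSR_SHIFT     0x04
#define BMA400_SCALE_SHIFT      0x06

/* Hypothetical helper: extract a field from a raw register byte. */
static inline uint8_t bma400_field_get(uint8_t reg, uint8_t mask, uint8_t shift)
{
	assert(shift < 8);
	return (uint8_t)((reg & mask) >> shift);
}

/* Hypothetical helper: return reg with one field replaced by val. */
static inline uint8_t bma400_field_set(uint8_t reg, uint8_t mask,
				       uint8_t shift, uint8_t val)
{
	assert(shift < 8);
	/* Clear the field, then insert the new value, clipped to the mask. */
	return (uint8_t)((reg & ~mask) | ((val << shift) & mask));
}
```

The output data rate field uses BMA400_ACC_ODR_MASK with a shift of zero,
since the patch defines no separate shift constant for it.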

Re: [PATCH net] net: sched: act_mirred: drop skb's dst_entry in ingress redirection

2019-10-12 Thread Sergei Shtylyov

Hello!

On 10/12/2019 10:16 AM, Zhiyuan Hou wrote:

> In act_mirred's ingress redirection, if the skb's dst_entry is valid
> when call function netif_receive_skb, the fllowing l3 stack process

  Following or flowing?

> (ip_rcv_finish_core) will check dst_entry and skip the routing
> decision. Using the old dst_entry is unexpected and may discard the
> skb in some case. For example dst->dst_input points to dst_discard.
> 
> This patch drops the skb's dst_entry before calling netif_receive_skb
> so that the skb can be made routing decision like a normal ingress
> skb.
> 
> Signed-off-by: Zhiyuan Hou 
> ---
>  net/sched/act_mirred.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
> index 9ce073a05414..6108a64c0cd5 100644
> --- a/net/sched/act_mirred.c
> +++ b/net/sched/act_mirred.c
[...]
> @@ -298,8 +299,10 @@ static int tcf_mirred_act(struct sk_buff *skb, const 
> struct tc_action *a,
>  
>   if (!want_ingress)
>   err = dev_queue_xmit(skb2);
> - else
> + else {
> + skb_dst_drop(skb2);
>   err = netif_receive_skb(skb2);
> + }

   If you introduce {} in one *if* branch, {} should be added to the other
branches as well, says CodingStyle.

[...]

MBR, Sergei
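
For reference, the brace placement asked for in the review could look
roughly like the sketch below. It is a user-space model, not the act_mirred
code: struct model_skb and the stub_*() functions are hypothetical stand-ins
for the skb and for dev_queue_xmit(), skb_dst_drop() and netif_receive_skb().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb; only tracks what the sketch needs. */
struct model_skb {
	bool has_dst;
	bool transmitted;
	bool received;
};

static int stub_dev_queue_xmit(struct model_skb *skb)
{
	assert(skb != NULL);
	skb->transmitted = true;
	return 0;
}

static void stub_skb_dst_drop(struct model_skb *skb)
{
	assert(skb != NULL);
	skb->has_dst = false;
}

static int stub_netif_receive_skb(struct model_skb *skb)
{
	assert(skb != NULL);
	skb->received = true;
	return 0;
}

static int model_redirect(struct model_skb *skb2, bool want_ingress)
{
	int err;

	/* Braces on both branches, because one branch has two statements. */
	if (!want_ingress) {
		err = stub_dev_queue_xmit(skb2);
	} else {
		stub_skb_dst_drop(skb2);
		err = stub_netif_receive_skb(skb2);
	}

	return err;
}
```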

Re: [PATCH RFC v1 0/2] vhost: ring format independence

2019-10-12 Thread Michael S. Tsirkin

On Sat, Oct 12, 2019 at 04:15:42PM +0800, Jason Wang wrote:
> 
> On 2019/10/11 下午9:45, Michael S. Tsirkin wrote:
> > So the idea is as follows: we convert descriptors to an
> > independent format first, and process that converting to
> > iov later.
> > 
> > The point is that we have a tight loop that fetches
> > descriptors, which is good for cache utilization.
> > This will also allow all kind of batching tricks -
> > e.g. it seems possible to keep SMAP disabled while
> > we are fetching multiple descriptors.
> 
> 
> I wonder this may help for performance:

Could you try it out and report please?
Would be very much appreciated.

> - another indirection layer, increased footprint

Seems to be offset by improved batching.
For sure will be even better if we can move stac/clac out,
or replace some get/put user with bigger copy to/from.

> - won't help or even degrade when there's no batch

I couldn't measure a difference. I'm guessing

> - an extra overhead in the case of in order where we should already had
> tight loop

it's not so tight with translation in there.
this exactly makes the loop tight.

> - need carefully deal with indirect and chain or make it only work for
> packet sit just in a single descriptor
> 
> Thanks

I don't understand this last comment.

> 
> > 
> > And perhaps more importantly, this is a very good fit for the packed
> > ring layout, where we get and put descriptors in order.
> > 
> > This patchset seems to already perform exactly the same as the original
> > code already based on a microbenchmark.  More testing would be very much
> > appreciated.
> > 
> > Biggest TODO before this first step is ready to go in is to
> > batch indirect descriptors as well.
> > 
> > Integrating into vhost-net is basically
> > s/vhost_get_vq_desc/vhost_get_vq_desc_batch/ -
> > or add a module parameter like I did in the test module.
> > 
> > 
> > 
> > Michael S. Tsirkin (2):
> >vhost: option to fetch descriptors through an independent struct
> >vhost: batching fetches
> > 
> >   drivers/vhost/test.c  |  19 ++-
> >   drivers/vhost/vhost.c | 333 +-
> >   drivers/vhost/vhost.h |  20 ++-
> >   3 files changed, 365 insertions(+), 7 deletions(-)
> >

[PATCH RFC v2 1/2] vhost: option to fetch descriptors through an independent struct

2019-10-12 Thread Michael S. Tsirkin

The idea is to support multiple ring formats by converting
to a format-independent array of descriptors.

This costs extra cycles, but we gain in ability
to fetch a batch of descriptors in one go, which
is good for code cache locality.

To simplify benchmarking, I kept the old code
around so one can switch back and forth by
writing into a module parameter.
This will go away in the final submission.

This patch causes a minor performance degradation,
it's been kept as simple as possible for ease of review.
Next patch gets us back the performance by adding batching.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/test.c  |  17 ++-
 drivers/vhost/vhost.c | 299 +-
 drivers/vhost/vhost.h |  16 +++
 3 files changed, 327 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 056308008288..39a018a7af2d 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -18,6 +18,9 @@
 #include "test.h"
 #include "vhost.h"
 
+static int newcode = 0;
+module_param(newcode, int, 0644);
+
 /* Max number of bytes transferred before requeueing the job.
  * Using this limit prevents one virtqueue from starving others. */
 #define VHOST_TEST_WEIGHT 0x8
@@ -58,10 +61,16 @@ static void handle_vq(struct vhost_test *n)
vhost_disable_notify(&n->dev, vq);
 
for (;;) {
-   head = vhost_get_vq_desc(vq, vq->iov,
-ARRAY_SIZE(vq->iov),
-, ,
-NULL, NULL);
+   if (newcode)
+   head = vhost_get_vq_desc_batch(vq, vq->iov,
+  ARRAY_SIZE(vq->iov),
+  , ,
+  NULL, NULL);
+   else
+   head = vhost_get_vq_desc(vq, vq->iov,
+ARRAY_SIZE(vq->iov),
+, ,
+NULL, NULL);
/* On error, stop handling until the next kick. */
if (unlikely(head < 0))
break;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 36ca2cf419bf..36661d6cb51f 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -301,6 +301,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
   struct vhost_virtqueue *vq)
 {
vq->num = 1;
+   vq->ndescs = 0;
vq->desc = NULL;
vq->avail = NULL;
vq->used = NULL;
@@ -369,6 +370,9 @@ static int vhost_worker(void *data)
 
 static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq)
 {
+   kfree(vq->descs);
+   vq->descs = NULL;
+   vq->max_descs = 0;
kfree(vq->indirect);
vq->indirect = NULL;
kfree(vq->log);
@@ -385,6 +389,10 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
 
for (i = 0; i < dev->nvqs; ++i) {
vq = dev->vqs[i];
+   vq->max_descs = dev->iov_limit;
+   vq->descs = kmalloc_array(vq->max_descs,
+ sizeof(*vq->descs),
+ GFP_KERNEL);
vq->indirect = kmalloc_array(UIO_MAXIOV,
 sizeof(*vq->indirect),
 GFP_KERNEL);
@@ -392,7 +400,7 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
GFP_KERNEL);
vq->heads = kmalloc_array(dev->iov_limit, sizeof(*vq->heads),
  GFP_KERNEL);
-   if (!vq->indirect || !vq->log || !vq->heads)
+   if (!vq->indirect || !vq->log || !vq->heads || !vq->descs)
goto err_nomem;
}
return 0;
@@ -2346,6 +2354,295 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 }
 EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
 
+static struct vhost_desc *peek_split_desc(struct vhost_virtqueue *vq)
+{
+   BUG_ON(!vq->ndescs);
+   return &vq->descs[vq->ndescs - 1];
+}
+
+static void pop_split_desc(struct vhost_virtqueue *vq)
+{
+   BUG_ON(!vq->ndescs);
+   --vq->ndescs;
+}
+
+static int push_split_desc(struct vhost_virtqueue *vq, struct vring_desc 
*desc, u16 id)
+{
+   struct vhost_desc *h;
+
+   if (unlikely(vq->ndescs >= vq->max_descs))
+   return -EINVAL;
+   h = &vq->descs[vq->ndescs++];
+   h->addr = vhost64_to_cpu(vq, desc->addr);
+   h->len = vhost32_to_cpu(vq, desc->len);
+   h->flags = vhost16_to_cpu(vq, desc->flags);
+   h->id = id;
+
+   return 0;
+}
+
+static int fetch_indirect_descs(struct vhost_virtqueue *vq,
+   struct vhost_desc *indirect,
+   u16 head)
+{
+

[PATCH RFC v2 2/2] vhost: batching fetches

2019-10-12 Thread Michael S. Tsirkin

With this patch applied, new and old code perform identically.

Lots of extra optimizations are now possible, e.g.
we can fetch multiple heads with copy_from/to_user now.
We can get rid of maintaining the log array.  Etc etc.

Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/test.c  |  2 +-
 drivers/vhost/vhost.c | 50 ---
 drivers/vhost/vhost.h |  4 +++-
 3 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 39a018a7af2d..e3a8e9db22cd 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -128,7 +128,7 @@ static int vhost_test_open(struct inode *inode, struct file 
*f)
dev = &n->dev;
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
-   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV,
+   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV + 64,
   VHOST_TEST_PKT_WEIGHT, VHOST_TEST_WEIGHT);
 
f->private_data = n;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 36661d6cb51f..50d4a148d60d 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -302,6 +302,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
 {
vq->num = 1;
vq->ndescs = 0;
+   vq->first_desc = 0;
vq->desc = NULL;
vq->avail = NULL;
vq->used = NULL;
@@ -390,6 +391,7 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
for (i = 0; i < dev->nvqs; ++i) {
vq = dev->vqs[i];
vq->max_descs = dev->iov_limit;
+   vq->batch_descs = dev->iov_limit - UIO_MAXIOV;
vq->descs = kmalloc_array(vq->max_descs,
  sizeof(*vq->descs),
  GFP_KERNEL);
@@ -2366,6 +2368,8 @@ static void pop_split_desc(struct vhost_virtqueue *vq)
--vq->ndescs;
 }
 
+#define VHOST_DESC_FLAGS (VRING_DESC_F_INDIRECT | VRING_DESC_F_WRITE | \
+ VRING_DESC_F_NEXT)
 static int push_split_desc(struct vhost_virtqueue *vq, struct vring_desc 
*desc, u16 id)
 {
struct vhost_desc *h;
@@ -2375,7 +2379,7 @@ static int push_split_desc(struct vhost_virtqueue *vq, 
struct vring_desc *desc,
h = >descs[vq->ndescs++];
h->addr = vhost64_to_cpu(vq, desc->addr);
h->len = vhost32_to_cpu(vq, desc->len);
-   h->flags = vhost16_to_cpu(vq, desc->flags);
+   h->flags = vhost16_to_cpu(vq, desc->flags) & VHOST_DESC_FLAGS;
h->id = id;
 
return 0;
@@ -2450,7 +2454,7 @@ static int fetch_indirect_descs(struct vhost_virtqueue 
*vq,
return 0;
 }
 
-static int fetch_descs(struct vhost_virtqueue *vq)
+static int fetch_buf(struct vhost_virtqueue *vq)
 {
struct vring_desc desc;
unsigned int i, head, found = 0;
@@ -2462,7 +2466,11 @@ static int fetch_descs(struct vhost_virtqueue *vq)
/* Check it isn't doing very strange things with descriptor numbers. */
last_avail_idx = vq->last_avail_idx;
 
-   if (vq->avail_idx == vq->last_avail_idx) {
+   if (unlikely(vq->avail_idx == vq->last_avail_idx)) {
+   /* If we already have work to do, don't bother re-checking. */
+   if (likely(vq->ndescs))
+   return vq->num;
+
if (unlikely(vhost_get_avail_idx(vq, _idx))) {
vq_err(vq, "Failed to access avail idx at %p\n",
>avail->idx);
@@ -2541,6 +2549,24 @@ static int fetch_descs(struct vhost_virtqueue *vq)
return 0;
 }
 
+static int fetch_descs(struct vhost_virtqueue *vq)
+{
+   int ret = 0;
+
+   if (unlikely(vq->first_desc >= vq->ndescs)) {
+   vq->first_desc = 0;
+   vq->ndescs = 0;
+   }
+
+   if (vq->ndescs)
+   return 0;
+
+   while (!ret && vq->ndescs <= vq->batch_descs)
+   ret = fetch_buf(vq);
+
+   return vq->ndescs ? 0 : ret;
+}
+
 /* This looks in the virtqueue and for the first available buffer, and converts
  * it to an iovec for convenient access.  Since descriptors consist of some
  * number of output then some number of input descriptors, it's actually two
@@ -2562,6 +2588,8 @@ int vhost_get_vq_desc_batch(struct vhost_virtqueue *vq,
if (ret)
return ret;
 
+   /* Note: indirect descriptors are not batched */
+   /* TODO: batch up to a limit */
last = peek_split_desc(vq);
id = last->id;
 
@@ -2584,12 +2612,12 @@ int vhost_get_vq_desc_batch(struct vhost_virtqueue *vq,
if (unlikely(log))
*log_num = 0;
 
-   for (i = 0; i < vq->ndescs; ++i) {
+   for (i = vq->first_desc; i < vq->ndescs; ++i) {
unsigned iov_count = *in_num + *out_num;
struct vhost_desc *desc = &vq->descs[i];
int access;
 
-   if (desc->flags &

[PATCH RFC v2 0/2] vhost: ring format independence

2019-10-12 Thread Michael S. Tsirkin

This adds infrastructure required for supporting
multiple ring formats.

The idea is as follows: we convert descriptors to an
independent format first, and process that converting to
iov later.

The point is that we have a tight loop that fetches
descriptors, which is good for cache utilization.
This will also allow all kinds of batching tricks -
e.g. it seems possible to keep SMAP disabled while
we are fetching multiple descriptors.

Based on a microbenchmark, this already seems to perform exactly the same
as the original code.
More testing would be very much appreciated.

Biggest TODO before this first step is ready to go in is to
batch indirect descriptors as well.

Integrating into vhost-net is basically
s/vhost_get_vq_desc/vhost_get_vq_desc_batch/ -
or add a module parameter like I did in the test module.


Changes from v1:
- typo fixes

Michael S. Tsirkin (2):
  vhost: option to fetch descriptors through an independent struct
  vhost: batching fetches

 drivers/vhost/test.c  |  19 ++-
 drivers/vhost/vhost.c | 333 +-
 drivers/vhost/vhost.h |  20 ++-
 3 files changed, 365 insertions(+), 7 deletions(-)

-- 
MST
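
To make the conversion step concrete, here is a minimal user-space sketch
of the idea described above: ring descriptors are first copied into a
format-independent array in one tight loop, and only later turned into
iovecs. This is a model under stated assumptions, not the patch itself.
The types split_desc, indep_desc and model_vq and the model_*() functions
are hypothetical; the fields addr, len, flags, id, ndescs and max_descs
mirror those assigned in patch 1/2. The sketch assumes host-endian
descriptors, whereas the patch converts with vhost64_to_cpu() and friends.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Split-ring descriptor as the guest lays it out (simplified). */
struct split_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/* Format-independent descriptor, one per fetched ring entry. */
struct indep_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t id;
};

#define MODEL_MAX_DESCS 8

struct model_vq {
	struct indep_desc descs[MODEL_MAX_DESCS];
	unsigned int ndescs;     /* entries currently stored */
	unsigned int max_descs;  /* capacity in use, <= MODEL_MAX_DESCS */
};

/* Append one converted descriptor; fails with -EINVAL when full. */
static int model_push_desc(struct model_vq *vq, const struct split_desc *d,
			   uint16_t id)
{
	struct indep_desc *h;

	assert(vq->max_descs <= MODEL_MAX_DESCS);
	if (vq->ndescs >= vq->max_descs)
		return -EINVAL;
	h = &vq->descs[vq->ndescs++];
	h->addr = d->addr;
	h->len = d->len;
	h->flags = d->flags;
	h->id = id;
	return 0;
}

/*
 * Fetch count descriptors in one go. The loop only copies, with no
 * address translation inside it. Returns the number of descriptors
 * converted, or a negative errno.
 */
static int model_fetch_batch(struct model_vq *vq, const struct split_desc *ring,
			     unsigned int count, uint16_t first_id)
{
	unsigned int i;

	for (i = 0; i < count; i++) {
		int ret = model_push_desc(vq, &ring[i], (uint16_t)(first_id + i));

		if (ret)
			return ret;
	}
	return (int)count;
}
```

A different ring layout would only need its own push function that fills
the same indep_desc array; the code that consumes the array would not change.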

[PATCH 7/7] Add a new sysctl for limiting userfaultfd to user mode faults

2019-10-12 Thread Daniel Colascione

Add a new sysctl knob unprivileged_userfaultfd_user_mode_only.
This sysctl can be set to either zero or one. When zero (the default)
the system lets all users call userfaultfd with or without
UFFD_USER_MODE_ONLY, modulo other access controls. When
unprivileged_userfaultfd_user_mode_only is set to one, users without
CAP_SYS_PTRACE must pass UFFD_USER_MODE_ONLY to userfaultfd or the API
will fail with EPERM. This facility allows administrators to reduce
the likelihood that an attacker with access to userfaultfd can delay
faulting kernel code to widen timing windows for other exploits.

Signed-off-by: Daniel Colascione 
---
 Documentation/admin-guide/sysctl/vm.rst | 13 +
 fs/userfaultfd.c| 12 ++--
 include/linux/userfaultfd_k.h   |  1 +
 kernel/sysctl.c |  9 +
 4 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/vm.rst 
b/Documentation/admin-guide/sysctl/vm.rst
index 6664eec7bd35..330fd82b3f4e 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -849,6 +849,19 @@ they pass the UFFD_SECURE, enabling MAC security checks.
 
 The default value is 1.
 
+unprivileged_userfaultfd_user_mode_only
+
+
+This flag controls whether unprivileged users can use the userfaultfd
+system calls to handle page faults in kernel mode.  If set to zero,
+userfaultfd works with or without UFFD_USER_MODE_ONLY, modulo
+unprivileged_userfaultfd above.  If set to one, users without
+CAP_SYS_PTRACE must pass UFFD_USER_MODE_ONLY in order for userfaultfd
+to succeed.  Prohibiting use of userfaultfd for handling faults from
+kernel mode may make certain vulnerabilities more difficult
+to exploit.
+
+The default value is 0.
 
 user_reserve_kbytes
 ===
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index aaed9347973e..02addd425ab7 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -29,6 +29,7 @@
 #include 
 
 int sysctl_unprivileged_userfaultfd __read_mostly = 1;
+int sysctl_unprivileged_userfaultfd_user_mode_only __read_mostly = 0;
 
 static struct kmem_cache *userfaultfd_ctx_cachep __read_mostly;
 
@@ -1963,8 +1964,15 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
struct userfaultfd_ctx *ctx;
int fd;
static const int uffd_flags = UFFD_SECURE | UFFD_USER_MODE_ONLY;
-   bool need_cap_check = sysctl_unprivileged_userfaultfd == 0 ||
-   (sysctl_unprivileged_userfaultfd == 2 && !(flags & 
UFFD_SECURE));
+   bool need_cap_check = false;
+
+   if (sysctl_unprivileged_userfaultfd == 0 ||
+   (sysctl_unprivileged_userfaultfd == 2 && !(flags & UFFD_SECURE)))
+   need_cap_check = true;
+
+   if (sysctl_unprivileged_userfaultfd_user_mode_only &&
+   (flags & UFFD_USER_MODE_ONLY) == 0)
+   need_cap_check = true;
 
if (need_cap_check && !capable(CAP_SYS_PTRACE))
return -EPERM;
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 549c8b0cca52..efe14abb2dc8 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -29,6 +29,7 @@
 #define UFFD_FLAGS_SET (EFD_SHARED_FCNTL_FLAGS)
 
 extern int sysctl_unprivileged_userfaultfd;
+extern int sysctl_unprivileged_userfaultfd_user_mode_only;
 
 extern const struct file_operations userfaultfd_fops;
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index fc98d5df344e..4f296676c0ac 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1740,6 +1740,15 @@ static struct ctl_table vm_table[] = {
.extra1 = SYSCTL_ZERO,
.extra2 = ,
},
+   {
+   .procname   = "unprivileged_userfaultfd_user_mode_only",
+   .data   = &sysctl_unprivileged_userfaultfd_user_mode_only,
+   .maxlen = sizeof(sysctl_unprivileged_userfaultfd_user_mode_only),
+   .mode   = 0644,
+   .proc_handler   = proc_dointvec_minmax,
+   .extra1 = SYSCTL_ZERO,
+   .extra2 = SYSCTL_ONE,
+   },
 #endif
{ }
 };
-- 
2.23.0.700.g56cf767bdb-goog
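
Not part of the patch: the capability decision that the userfaultfd() hunk above builds up can be modeled in plain userspace C. This is only a sketch under assumptions. The function name uffd_needs_cap() is hypothetical, and the two flag values are the ones this series defines, not a stable ABI.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in this patch series; illustrative only. */
#define UFFD_SECURE         1
#define UFFD_USER_MODE_ONLY 2

/*
 * Hypothetical userspace model of the check in userfaultfd(2).
 * Returns true when the caller would need CAP_SYS_PTRACE.
 */
static bool uffd_needs_cap(int unprivileged_userfaultfd,
                           int user_mode_only_sysctl, int flags)
{
    bool need_cap_check = false;

    /* 0: always privileged; 2: privileged unless UFFD_SECURE is passed. */
    if (unprivileged_userfaultfd == 0 ||
        (unprivileged_userfaultfd == 2 && !(flags & UFFD_SECURE)))
        need_cap_check = true;

    /* New sysctl: callers that do not ask for user-mode-only need the cap. */
    if (user_mode_only_sysctl && (flags & UFFD_USER_MODE_ONLY) == 0)
        need_cap_check = true;

    return need_cap_check;
}
```

The two conditions are independent, so either sysctl alone can force the capability check; a caller has to satisfy both to stay unprivileged.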

Re: [PATCH 1/2] drm/imx: Fix error handling for a kmemdup() call in imx_pd_bind()

2019-10-12 Thread Navid Emamdoost

On Sat, Oct 12, 2019 at 4:07 AM Markus Elfring  wrote:
>
> From: Markus Elfring 
> Date: Sat, 12 Oct 2019 10:30:21 +0200
>
> The return value from a call of the function “kmemdup” was not checked
> in this function implementation. Thus add the corresponding error handling.
>
> Fixes: 19022aaae677dfa171a719e9d1ff04823ce65a65 ("staging: drm/imx: Add parallel display support")
> Signed-off-by: Markus Elfring 
> ---
>  drivers/gpu/drm/imx/parallel-display.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/imx/parallel-display.c b/drivers/gpu/drm/imx/parallel-display.c
> index 35518e5de356..39c4798f56b6 100644
> --- a/drivers/gpu/drm/imx/parallel-display.c
> +++ b/drivers/gpu/drm/imx/parallel-display.c
> @@ -210,8 +210,13 @@ static int imx_pd_bind(struct device *dev, struct device *master, void *data)
> return -ENOMEM;
>
> edidp = of_get_property(np, "edid", &imxpd->edid_len);
> -   if (edidp)
> +   if (edidp) {
> imxpd->edid = kmemdup(edidp, imxpd->edid_len, GFP_KERNEL);
> +   if (!imxpd->edid) {
> +   devm_kfree(dev, imxpd);

You should not try to free imxpd here, as it is a resource-managed
allocation made via devm_kzalloc(). Memory allocated with this function
is automatically freed on driver detach, so this patch introduces a
double-free.

> +   return -ENOMEM;
> +   }
> +   }
>
> ret = of_property_read_string(np, "interface-pix-fmt", );
> if (!ret) {
> --
> 2.23.0
>


-- 
Navid.

Re: Linux 5.3.6

2019-10-12 Thread Chris Clayton



> I'm announcing the release of the 5.3.6 kernel.


5.3.6 build fails here with:

arch/x86/entry/vdso/vdso64.so.dbg: undefined symbols found
  CC  arch/x86/kernel/cpu/mce/threshold.o
make[3]: *** [arch/x86/entry/vdso/Makefile:59: arch/x86/entry/vdso/vdso64.so.dbg] Error 1
make[3]: *** Deleting file 'arch/x86/entry/vdso/vdso64.so.dbg'
make[2]: *** [scripts/Makefile.build:497: arch/x86/entry/vdso] Error 2
make[1]: *** [scripts/Makefile.build:497: arch/x86/entry] Error 2
make[1]: *** Waiting for unfinished jobs

Chris Clayton

[PATCH 6/7] Allow users to require UFFD_SECURE

2019-10-12 Thread Daniel Colascione

This change adds 2 as an allowable value for
unprivileged_userfaultfd. (Previously, this sysctl could be either 0
or 1.) When unprivileged_userfaultfd is 2, users with CAP_SYS_PTRACE
may create userfaultfd with or without UFFD_SECURE, but users without
CAP_SYS_PTRACE must pass UFFD_SECURE to userfaultfd in order for the
system call to succeed, effectively forcing them to opt into
additional security checks.

Signed-off-by: Daniel Colascione 
---
 Documentation/admin-guide/sysctl/vm.rst | 6 --
 fs/userfaultfd.c| 4 +++-
 kernel/sysctl.c | 2 +-
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 64aeee1009ca..6664eec7bd35 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -842,8 +842,10 @@ unprivileged_userfaultfd
 
 This flag controls whether unprivileged users can use the userfaultfd
 system calls.  Set this to 1 to allow unprivileged users to use the
-userfaultfd system calls, or set this to 0 to restrict userfaultfd to only
-privileged users (with SYS_CAP_PTRACE capability).
+userfaultfd system calls, or set this to 0 to restrict userfaultfd to
+only privileged users (with CAP_SYS_PTRACE capability).  If set to 2,
+unprivileged (non-CAP_SYS_PTRACE) users may use userfaultfd only if
+they pass the UFFD_SECURE flag, enabling MAC security checks.
 
 The default value is 1.
 
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 986d23b2cd33..aaed9347973e 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1963,8 +1963,10 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
struct userfaultfd_ctx *ctx;
int fd;
static const int uffd_flags = UFFD_SECURE | UFFD_USER_MODE_ONLY;
+   bool need_cap_check = sysctl_unprivileged_userfaultfd == 0 ||
+   (sysctl_unprivileged_userfaultfd == 2 && !(flags & UFFD_SECURE));
 
-   if (!sysctl_unprivileged_userfaultfd && !capable(CAP_SYS_PTRACE))
+   if (need_cap_check && !capable(CAP_SYS_PTRACE))
return -EPERM;
 
BUG_ON(!current->mm);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 00fcea236eba..fc98d5df344e 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1738,7 +1738,7 @@ static struct ctl_table vm_table[] = {
.mode   = 0644,
.proc_handler   = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
-   .extra2 = SYSCTL_ONE,
+   .extra2 = ,
},
 #endif
{ }
-- 
2.23.0.700.g56cf767bdb-goog

[PATCH 4/7] Teach SELinux about a new userfaultfd class

2019-10-12 Thread Daniel Colascione

Use the secure anonymous inode LSM hook we just added to let SELinux
policy place restrictions on userfaultfd use. The create operation
applies to processes creating new instances of these file objects;
transfer between processes is covered by restrictions on read, write,
and ioctl access already checked inside selinux_file_receive.

Signed-off-by: Daniel Colascione 
---
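
Not part of the commit: a minimal userspace sketch of the classification step that selinux_inode_init_security_anon() performs in the diff below. The class is chosen by comparing the file_operations pointer against known tables, and anything unrecognized is refused. All sketch_* names are hypothetical stand-ins, and the real hook additionally labels the inode and runs an AVC check.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct file_operations. */
struct sketch_fops { int unused; };

static const struct sketch_fops sketch_userfaultfd_fops = { 0 };
static const struct sketch_fops sketch_other_fops = { 0 };

enum sketch_class { SKETCH_CLASS_FILE, SKETCH_CLASS_UFFD };

/*
 * Returns 0 and stores the security class, or -EOPNOTSUPP when the
 * fops table is not one the policy knows how to label.
 */
static int sketch_classify(const struct sketch_fops *fops,
                           enum sketch_class *out)
{
    enum sketch_class sclass = SKETCH_CLASS_FILE; /* "unknown" default */

    if (fops == &sketch_userfaultfd_fops)
        sclass = SKETCH_CLASS_UFFD;

    if (sclass == SKETCH_CLASS_FILE)
        return -EOPNOTSUPP;

    *out = sclass;
    return 0;
}
```

Keying on the fops pointer is why the patch has to make userfaultfd_fops non-static and declare it in userfaultfd_k.h.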
 fs/userfaultfd.c|  4 +-
 include/linux/userfaultfd_k.h   |  2 +
 security/selinux/hooks.c| 68 +
 security/selinux/include/classmap.h |  2 +
 4 files changed, 73 insertions(+), 3 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 29f920fb236e..1123089c3d55 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1014,8 +1014,6 @@ static __poll_t userfaultfd_poll(struct file *file, poll_table *wait)
}
 }
 
-static const struct file_operations userfaultfd_fops;
-
 static int resolve_userfault_fork(struct userfaultfd_ctx *ctx,
  struct userfaultfd_ctx *new,
  struct uffd_msg *msg)
@@ -1934,7 +1932,7 @@ static void userfaultfd_show_fdinfo(struct seq_file *m, struct file *f)
 }
 #endif
 
-static const struct file_operations userfaultfd_fops = {
+const struct file_operations userfaultfd_fops = {
 #ifdef CONFIG_PROC_FS
.show_fdinfo= userfaultfd_show_fdinfo,
 #endif
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index ac9d71e24b81..549c8b0cca52 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -30,6 +30,8 @@
 
 extern int sysctl_unprivileged_userfaultfd;
 
+extern const struct file_operations userfaultfd_fops;
+
 extern vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason);
 
 extern ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start,
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 9625b99e677f..0b3a36cbfbdc 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -92,6 +92,10 @@
 #include 
 #include 
 
+#ifdef CONFIG_USERFAULTFD
+#include 
+#endif
+
 #include "avc.h"
 #include "objsec.h"
 #include "netif.h"
@@ -2943,6 +2947,69 @@ static int selinux_inode_init_security(struct inode *inode, struct inode *dir,
return 0;
 }
 
+static int selinux_inode_init_security_anon(struct inode *inode,
+   const char *name,
+   const struct file_operations *fops)
+{
+   const struct task_security_struct *tsec = selinux_cred(current_cred());
+   struct common_audit_data ad;
+   struct inode_security_struct *isec;
+
+   if (unlikely(IS_PRIVATE(inode)))
+   return 0;
+
+   /*
+* We shouldn't be creating secure anonymous inodes before LSM
+* initialization completes.
+*/
+   if (unlikely(!selinux_state.initialized))
+   return -EBUSY;
+
+   isec = selinux_inode(inode);
+
+   /*
+* We only get here once per ephemeral inode.  The inode has
+* been initialized via inode_alloc_security but is otherwise
+* untouched, so check that the state is as
+* inode_alloc_security left it.
+*/
+   BUG_ON(isec->initialized != LABEL_INVALID);
+   BUG_ON(isec->sclass != SECCLASS_FILE);
+
+#ifdef CONFIG_USERFAULTFD
+   if (fops == &userfaultfd_fops)
+   isec->sclass = SECCLASS_UFFD;
+#endif
+
+   if (isec->sclass == SECCLASS_FILE) {
+   printk(KERN_WARNING "refusing to create secure anonymous inode "
+  "of unknown type");
+   return -EOPNOTSUPP;
+   }
+   /*
+* Always give secure anonymous inodes the sid of the
+* creating task.
+*/
+
+   isec->sid = tsec->sid;
+   isec->initialized = LABEL_INITIALIZED;
+
+   /*
+* Now that we've initialized security, check whether we're
+* allowed to actually create this type of anonymous inode.
+*/
+
+   ad.type = LSM_AUDIT_DATA_INODE;
+   ad.u.inode = inode;
+
+   return avc_has_perm(&selinux_state,
+   tsec->sid,
+   isec->sid,
+   isec->sclass,
+   FILE__CREATE,
+   &ad);
+}
+
 static int selinux_inode_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
return may_create(dir, dentry, SECCLASS_FILE);
@@ -6840,6 +6907,7 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
LSM_HOOK_INIT(inode_alloc_security, selinux_inode_alloc_security),
LSM_HOOK_INIT(inode_free_security, selinux_inode_free_security),
LSM_HOOK_INIT(inode_init_security, selinux_inode_init_security),
+   LSM_HOOK_INIT(inode_init_security_anon, selinux_inode_init_security_anon),
LSM_HOOK_INIT(inode_create, selinux_inode_create),

[PATCH 2/7] Add a concept of a "secure" anonymous file

2019-10-12 Thread Daniel Colascione

A secure anonymous file is one we hooked up to its own inode (as
opposed to the shared inode we use for non-secure anonymous files). A
new LSM hook gives security modules a chance to initialize, label,
and veto the creation of these secure anonymous files. Security
modules had limited ability to interact with non-secure anonymous files
due to all of these files sharing a single inode.

Signed-off-by: Daniel Colascione 
---
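
Not part of the commit: a minimal userspace sketch of the two paths described above, assuming a simplified inode with only a reference count. Without ANON_INODE_SECURE every caller gets the shared singleton; with it, each caller gets a fresh inode that a veto callback may reject. The sketch_* names and the value chosen for ANON_INODE_SECURE are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define ANON_INODE_SECURE 1 /* name from the patch; value assumed */

struct sketch_inode {
    int refcount;
    int secure;
};

/* Hypothetical veto callback standing in for the LSM hook. */
typedef int (*sketch_hook_t)(const struct sketch_inode *inode,
                             const char *name);

static struct sketch_inode shared_inode = { .refcount = 1, .secure = 0 };

/* Returns an inode, or NULL with *err set to a negative errno value. */
static struct sketch_inode *sketch_get_inode(const char *name,
                                             int anon_inode_flags,
                                             sketch_hook_t hook, int *err)
{
    struct sketch_inode *inode;

    *err = 0;
    if (anon_inode_flags & ~ANON_INODE_SECURE) {
        *err = -EINVAL;
        return NULL;
    }
    if (!(anon_inode_flags & ANON_INODE_SECURE)) {
        shared_inode.refcount++; /* plays the role of ihold() */
        return &shared_inode;
    }

    inode = calloc(1, sizeof(*inode));
    if (!inode) {
        *err = -ENOMEM;
        return NULL;
    }
    inode->refcount = 1;
    inode->secure = 1;

    *err = hook ? hook(inode, name) : 0;
    if (*err) {
        free(inode); /* plays the role of iput() after a veto */
        return NULL;
    }
    return inode;
}

/* Example hook that denies every secure inode. */
static int sketch_deny_all(const struct sketch_inode *inode, const char *name)
{
    (void)inode;
    (void)name;
    return -EPERM;
}
```

Giving each secure file its own inode is what lets a security module attach a distinct label to it, which the shared singleton cannot carry.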
 fs/anon_inodes.c  | 45 ++-
 include/linux/lsm_hooks.h |  8 +++
 include/linux/security.h  |  2 ++
 security/security.c   |  8 +++
 4 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index caa36019afca..d68d76523ad3 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -55,6 +55,23 @@ static struct file_system_type anon_inode_fs_type = {
.kill_sb= kill_anon_super,
 };
 
+struct inode *anon_inode_make_secure_inode(const char *name,
+  const struct file_operations *fops)
+{
+   struct inode *inode;
+   int error;
+   inode = alloc_anon_inode(anon_inode_mnt->mnt_sb);
+   if (IS_ERR(inode))
+   return ERR_PTR(PTR_ERR(inode));
+   inode->i_flags &= ~S_PRIVATE;
+   error = security_inode_init_security_anon(inode, name, fops);
+   if (error) {
+   iput(inode);
+   return ERR_PTR(error);
+   }
+   return inode;
+}
+
 /**
  * anon_inode_getfile2 - creates a new file instance by hooking it up to
  *   an anonymous inode, and a dentry that describe
@@ -72,7 +89,9 @@ static struct file_system_type anon_inode_fs_type = {
  * hence saving memory and avoiding code duplication for the file/inode/dentry
  * setup.  Returns the newly created file* or an error pointer.
  *
- * anon_inode_flags must be zero.
+ * If anon_inode_flags contains ANON_INODE_SECURE, create a new inode
+ * and enable security checks for it. Otherwise, attach a new file to
+ * a singleton placeholder inode with security checks disabled.
  */
 struct file *anon_inode_getfile2(const char *name,
 const struct file_operations *fops,
@@ -81,17 +100,23 @@ struct file *anon_inode_getfile2(const char *name,
struct inode *inode;
struct file *file;
 
-   if (anon_inode_flags)
+   if (anon_inode_flags & ~ANON_INODE_SECURE)
return ERR_PTR(-EINVAL);
 
-   inode = anon_inode_inode;
-   if (IS_ERR(inode))
-   return ERR_PTR(-ENODEV);
-   /*
-* We know the anon_inode inode count is always
-* greater than zero, so ihold() is safe.
-*/
-   ihold(inode);
+   if (anon_inode_flags & ANON_INODE_SECURE) {
+   inode = anon_inode_make_secure_inode(name, fops);
+   if (IS_ERR(inode))
+   return ERR_PTR(PTR_ERR(inode));
+   } else {
+   inode = anon_inode_inode;
+   if (IS_ERR(inode))
+   return ERR_PTR(-ENODEV);
+   /*
+* We know the anon_inode inode count is always
+* greater than zero, so ihold() is safe.
+*/
+   ihold(inode);
+   }
 
if (fops->owner && !try_module_get(fops->owner)) {
file = ERR_PTR(-ENOENT);
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index a3763247547c..3744ce9e9172 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -215,6 +215,10 @@
  * Returns 0 if @name and @value have been successfully set,
  * -EOPNOTSUPP if no security attribute is needed, or
  * -ENOMEM on memory allocation failure.
+ * @inode_init_security_anon:
+ *  Set up a secure anonymous inode.
+ * Returns 0 on success. Returns -EPERM if the security module denies
+ * the creation of this inode.
  * @inode_create:
  * Check permission to create a regular file.
  * @dir contains inode structure of the parent of the new file.
@@ -1552,6 +1556,9 @@ union security_list_options {
const struct qstr *qstr,
const char **name, void **value,
size_t *len);
+   int (*inode_init_security_anon)(struct inode *inode,
+   const char *name,
+   const struct file_operations *fops);
int (*inode_create)(struct inode *dir, struct dentry *dentry,
umode_t mode);
int (*inode_link)(struct dentry *old_dentry, struct inode *dir,
@@ -1876,6 +1883,7 @@ struct security_hook_heads {
struct hlist_head inode_alloc_security;
struct hlist_head inode_free_security;
struct hlist_head inode_init_security;
+   struct hlist_head inode_init_security_anon;
struct hlist_head inode_create;

[PATCH 1/7] Add a new flags-accepting interface for anonymous inodes

2019-10-12 Thread Daniel Colascione

Add functions forwarding from the old names to the new ones so we
don't need to change any callers.

Signed-off-by: Daniel Colascione 
---
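
Not part of the commit: a minimal sketch of the forwarding pattern the description refers to, in which the old entry point stays source-compatible by calling the new one with zero flags. The header hunk is not visible in this excerpt, so the exact form used there is an assumption; the sketch_* functions are hypothetical userspace stand-ins.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for the new flags-accepting interface.
 * At this point in the series, anon_inode_flags must be zero.
 */
static int sketch_getfd2(const char *name, int flags, int anon_inode_flags)
{
    (void)name;
    (void)flags;
    if (anon_inode_flags)
        return -EINVAL;
    return 0; /* a real implementation would return a descriptor */
}

/* Old name: forwards to the new interface so callers need no changes. */
static inline int sketch_getfd(const char *name, int flags)
{
    return sketch_getfd2(name, flags, 0);
}
```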
 fs/anon_inodes.c| 62 ++---
 include/linux/anon_inodes.h | 27 +---
 2 files changed, 59 insertions(+), 30 deletions(-)

diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index 89714308c25b..caa36019afca 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -56,60 +56,71 @@ static struct file_system_type anon_inode_fs_type = {
 };
 
 /**
- * anon_inode_getfile - creates a new file instance by hooking it up to an
- *  anonymous inode, and a dentry that describe the "class"
- *  of the file
+ * anon_inode_getfile2 - creates a new file instance by hooking it up to
+ *   an anonymous inode, and a dentry that describe
+ *   the "class" of the file
  *
  * @name:[in]name of the "class" of the new file
  * @fops:[in]file operations for the new file
  * @priv:[in]private data for the new file (will be file's private_data)
- * @flags:   [in]flags
+ * @flags:   [in]flags for the file
+ * @anon_inode_flags: [in] flags for anon_inode*
  *
- * Creates a new file by hooking it on a single inode. This is useful for files
+ * Creates a new file by hooking it on an unspecified inode. This is useful for files
  * that do not need to have a full-fledged inode in order to operate correctly.
  * All the files created with anon_inode_getfile() will share a single inode,
  * hence saving memory and avoiding code duplication for the file/inode/dentry
  * setup.  Returns the newly created file* or an error pointer.
+ *
+ * anon_inode_flags must be zero.
  */
-struct file *anon_inode_getfile(const char *name,
-   const struct file_operations *fops,
-   void *priv, int flags)
+struct file *anon_inode_getfile2(const char *name,
+const struct file_operations *fops,
+void *priv, int flags, int anon_inode_flags)
 {
+   struct inode *inode;
struct file *file;
 
-   if (IS_ERR(anon_inode_inode))
-   return ERR_PTR(-ENODEV);
-
-   if (fops->owner && !try_module_get(fops->owner))
-   return ERR_PTR(-ENOENT);
+   if (anon_inode_flags)
+   return ERR_PTR(-EINVAL);
 
+   inode = anon_inode_inode;
+   if (IS_ERR(inode))
+   return ERR_PTR(-ENODEV);
/*
-* We know the anon_inode inode count is always greater than zero,
-* so ihold() is safe.
+* We know the anon_inode inode count is always
+* greater than zero, so ihold() is safe.
 */
-   ihold(anon_inode_inode);
-   file = alloc_file_pseudo(anon_inode_inode, anon_inode_mnt, name,
+   ihold(inode);
+
+   if (fops->owner && !try_module_get(fops->owner)) {
+   file = ERR_PTR(-ENOENT);
+   goto err;
+   }
+
+   file = alloc_file_pseudo(inode, anon_inode_mnt, name,
 flags & (O_ACCMODE | O_NONBLOCK), fops);
if (IS_ERR(file))
goto err;
 
-   file->f_mapping = anon_inode_inode->i_mapping;
+   file->f_mapping = inode->i_mapping;
 
file->private_data = priv;
 
return file;
 
 err:
-   iput(anon_inode_inode);
+   iput(inode);
module_put(fops->owner);
return file;
 }
 EXPORT_SYMBOL_GPL(anon_inode_getfile);
+EXPORT_SYMBOL_GPL(anon_inode_getfile2);
 
 /**
- * anon_inode_getfd - creates a new file instance by hooking it up to an
- *anonymous inode, and a dentry that describe the "class"
- *of the file
+ * anon_inode_getfd2 - creates a new file instance by hooking it up to an
+ * anonymous inode, and a dentry that describe the "class"
+ * of the file
  *
  * @name:[in]name of the "class" of the new file
  * @fops:[in]file operations for the new file
@@ -122,8 +133,8 @@ EXPORT_SYMBOL_GPL(anon_inode_getfile);
  * hence saving memory and avoiding code duplication for the file/inode/dentry
  * setup.  Returns new descriptor or an error code.
  */
-int anon_inode_getfd(const char *name, const struct file_operations *fops,
-void *priv, int flags)
+int anon_inode_getfd2(const char *name, const struct file_operations *fops,
+ void *priv, int flags, int anon_inode_flags)
 {
int error, fd;
struct file *file;
@@ -133,7 +144,7 @@ int anon_inode_getfd(const char *name, const struct file_operations *fops,
return error;
fd = error;
 
-   file = anon_inode_getfile(name, fops, priv, flags);
+   file = anon_inode_getfile2(name, fops, priv, flags, anon_inode_flags);
if (IS_ERR(file)) {
error = PTR_ERR(file);
goto

[PATCH 3/7] Add a UFFD_SECURE flag to the userfaultfd API.

2019-10-12 Thread Daniel Colascione

The new secure flag makes userfaultfd use a new "secure" anonymous
file object instead of the default one, letting security modules
supervise userfaultfd use.

Requiring that users pass a new flag lets us avoid changing the
semantics for existing callers.

Signed-off-by: Daniel Colascione 
---
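
Not part of the commit: a minimal userspace sketch of the flag validation this patch adds to the syscall entry. The shared fcntl-style flags and the userfaultfd-specific flags must not overlap (checked at build time), and any bit outside their union is rejected. SKETCH_SHARED_FCNTL_FLAGS uses illustrative values rather than the kernel's definitions, and uffd_check_flags() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-in for O_CLOEXEC | O_NONBLOCK; values assumed. */
#define SKETCH_SHARED_FCNTL_FLAGS (02000000 | 04000)
#define UFFD_SECURE 1 /* value from this patch */

/* Build-time analogue of the BUILD_BUG_ON(): the flag sets must be disjoint. */
_Static_assert((SKETCH_SHARED_FCNTL_FLAGS & UFFD_SECURE) == 0,
               "UFFD_SECURE collides with a shared fcntl flag");

/* Returns 0 if every bit in flags is known, -EINVAL otherwise. */
static int uffd_check_flags(int flags)
{
    static const int uffd_flags = UFFD_SECURE;

    if (flags & ~(SKETCH_SHARED_FCNTL_FLAGS | uffd_flags))
        return -EINVAL;
    return 0;
}
```

Rejecting unknown bits is what makes it possible to add further flags later without changing behavior for existing callers.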
 fs/userfaultfd.c | 28 +---
 include/uapi/linux/userfaultfd.h |  8 
 2 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index f9fd18670e22..29f920fb236e 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1022,6 +1022,13 @@ static int resolve_userfault_fork(struct userfaultfd_ctx *ctx,
 {
int fd;
 
+   /*
+* Using a secure-mode UFFD to monitor forks isn't supported
+* right now.
+*/
+   if (new->flags & UFFD_SECURE)
+   return -EOPNOTSUPP;
+
fd = anon_inode_getfd("[userfaultfd]", &userfaultfd_fops, new,
  O_RDWR | (new->flags & UFFD_SHARED_FCNTL_FLAGS));
if (fd < 0)
@@ -1841,6 +1848,18 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
ret = -EINVAL;
goto out;
}
+   if ((ctx->flags & UFFD_SECURE) &&
+   (features & UFFD_FEATURE_EVENT_FORK)) {
+   /*
+* We don't support UFFD_FEATURE_EVENT_FORK on a
+* secure-mode UFFD: doing so would need us to
+* construct the new file object in the context of the
+* fork child, and it's not worth it right now.
+*/
+   ret = -EINVAL;
+   goto out;
+   }
+
/* report all available features and ioctls to userland */
uffdio_api.features = UFFD_API_FEATURES;
uffdio_api.ioctls = UFFD_API_IOCTLS;
@@ -1942,6 +1961,7 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
 {
struct userfaultfd_ctx *ctx;
int fd;
+   static const int uffd_flags = UFFD_SECURE;
 
if (!sysctl_unprivileged_userfaultfd && !capable(CAP_SYS_PTRACE))
return -EPERM;
@@ -1951,8 +1971,9 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
/* Check the UFFD_* constants for consistency.  */
BUILD_BUG_ON(UFFD_CLOEXEC != O_CLOEXEC);
BUILD_BUG_ON(UFFD_NONBLOCK != O_NONBLOCK);
+   BUILD_BUG_ON(UFFD_SHARED_FCNTL_FLAGS & uffd_flags);
 
-   if (flags & ~UFFD_SHARED_FCNTL_FLAGS)
+   if (flags & ~(UFFD_SHARED_FCNTL_FLAGS | uffd_flags))
return -EINVAL;
 
ctx = kmem_cache_alloc(userfaultfd_ctx_cachep, GFP_KERNEL);
@@ -1969,8 +1990,9 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
/* prevent the mm struct to be freed */
mmgrab(ctx->mm);
 
-   fd = anon_inode_getfd("[userfaultfd]", &userfaultfd_fops, ctx,
- O_RDWR | (flags & UFFD_SHARED_FCNTL_FLAGS));
+   fd = anon_inode_getfd2("[userfaultfd]", &userfaultfd_fops, ctx,
+  O_RDWR | (flags & UFFD_SHARED_FCNTL_FLAGS),
+  ((flags & UFFD_SECURE) ? ANON_INODE_SECURE : 0));
if (fd < 0) {
mmdrop(ctx->mm);
kmem_cache_free(userfaultfd_ctx_cachep, ctx);
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 48f1a7c2f1f0..12d7d40d7f25 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -231,4 +231,12 @@ struct uffdio_zeropage {
__s64 zeropage;
 };
 
+/*
+ * Flags for the userfaultfd(2) system call itself.
+ */
+
+/*
+ * Create a userfaultfd with MAC security checks enabled.
+ */
+#define UFFD_SECURE 1
 #endif /* _LINUX_USERFAULTFD_H */
-- 
2.23.0.700.g56cf767bdb-goog

[PATCH 0/7] Harden userfaultfd

2019-10-12 Thread Daniel Colascione

Userfaultfd in unprivileged contexts could be potentially very
useful. We'd like to harden userfaultfd to make such unprivileged use
less risky. This patch series allows SELinux to manage userfaultfd
file descriptors (via a new flag, for compatibility with existing
code) and allows administrators to limit userfaultfd to servicing
user-mode faults, increasing the difficulty of using userfaultfd in
exploit chains that rely on delaying kernel faults.

A new anon_inodes interface allows callers to opt into SELinux
management of anonymous file objects. In this mode, anon_inodes
creates new ephemeral inodes for anonymous file objects instead of
reusing a singleton dummy inode. A new LSM hook gives security modules
an opportunity to configure and veto these ephemeral inodes.

Existing anon_inodes users must opt into the new functionality.

Daniel Colascione (7):
  Add a new flags-accepting interface for anonymous inodes
  Add a concept of a "secure" anonymous file
  Add a UFFD_SECURE flag to the userfaultfd API.
  Teach SELinux about a new userfaultfd class
  Let userfaultfd opt out of handling kernel-mode faults
  Allow users to require UFFD_SECURE
  Add a new sysctl for limiting userfaultfd to user mode faults

 Documentation/admin-guide/sysctl/vm.rst | 19 +-
 fs/anon_inodes.c| 89 +
 fs/userfaultfd.c| 47 +++--
 include/linux/anon_inodes.h | 27 ++--
 include/linux/lsm_hooks.h   |  8 +++
 include/linux/security.h|  2 +
 include/linux/userfaultfd_k.h   |  3 +
 include/uapi/linux/userfaultfd.h| 14 
 kernel/sysctl.c |  9 +++
 security/security.c |  8 +++
 security/selinux/hooks.c| 68 +++
 security/selinux/include/classmap.h |  2 +
 12 files changed, 256 insertions(+), 40 deletions(-)

-- 
2.23.0.700.g56cf767bdb-goog

[PATCH 5/7] Let userfaultfd opt out of handling kernel-mode faults

2019-10-12 Thread Daniel Colascione

userfaultfd handles page faults from both user and kernel code.  Add a
new UFFD_USER_MODE_ONLY flag for userfaultfd(2) that makes the
resulting userfaultfd object refuse to handle faults from kernel mode,
treating these faults as if SIGBUS were always raised, causing the
kernel code to fail with EFAULT.

A future patch adds a knob allowing administrators to give some
processes the ability to create userfaultfd file objects only if they
pass UFFD_USER_MODE_ONLY, reducing the likelihood that these processes
will exploit userfaultfd's ability to delay kernel page faults to open
timing windows for future exploits.

Signed-off-by: Daniel Colascione 
---
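
Not part of the commit: a minimal userspace sketch of the early bail-out that this patch adds to handle_userfault(). A fault raised from kernel mode is refused when the context was created with UFFD_USER_MODE_ONLY, in the same place where the existing SIGBUS feature already bails out. The SKETCH_* bit values are placeholders rather than the kernel's definitions, and uffd_will_handle() is a hypothetical name; a true result only means these two checks passed.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for this sketch; the kernel's may differ. */
#define SKETCH_FAULT_FLAG_USER     0x40u
#define SKETCH_UFFD_FEATURE_SIGBUS 0x80u
#define UFFD_USER_MODE_ONLY        2u /* value from this patch */

/*
 * Returns false when the fault would be refused up front (the faulting
 * code then sees a SIGBUS-style failure), true when handling continues.
 */
static bool uffd_will_handle(unsigned int fault_flags,
                             unsigned int ctx_features,
                             unsigned int ctx_flags)
{
    if (ctx_features & SKETCH_UFFD_FEATURE_SIGBUS)
        return false;

    /* Kernel-mode fault on a user-mode-only userfaultfd: refuse. */
    if ((fault_flags & SKETCH_FAULT_FLAG_USER) == 0 &&
        (ctx_flags & UFFD_USER_MODE_ONLY))
        return false;

    return true;
}
```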
 fs/userfaultfd.c | 5 -
 include/uapi/linux/userfaultfd.h | 6 ++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 1123089c3d55..986d23b2cd33 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -389,6 +389,9 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 
if (ctx->features & UFFD_FEATURE_SIGBUS)
goto out;
+   if ((vmf->flags & FAULT_FLAG_USER) == 0 &&
+   ctx->flags & UFFD_USER_MODE_ONLY)
+   goto out;
 
/*
 * If it's already released don't get it. This avoids to loop
@@ -1959,7 +1962,7 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
 {
struct userfaultfd_ctx *ctx;
int fd;
-   static const int uffd_flags = UFFD_SECURE;
+   static const int uffd_flags = UFFD_SECURE | UFFD_USER_MODE_ONLY;
 
if (!sysctl_unprivileged_userfaultfd && !capable(CAP_SYS_PTRACE))
return -EPERM;
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 12d7d40d7f25..eadd1497e7b5 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -239,4 +239,10 @@ struct uffdio_zeropage {
  * Create a userfaultfd with MAC security checks enabled.
  */
 #define UFFD_SECURE 1
+
+/*
+ * Create a userfaultfd that can handle page faults only in user mode.
+ */
+#define UFFD_USER_MODE_ONLY 2
+
 #endif /* _LINUX_USERFAULTFD_H */
-- 
2.23.0.700.g56cf767bdb-goog

[GIT PULL] MIPS fixes

2019-10-12 Thread Paul Burton

Hi Linus,

Here are a few MIPS fixes for 5.4; please pull.

Thanks,
Paul


The following changes since commit da0c9ea146cbe92b832f1b0f694840ea8eb33cce:

  Linux 5.4-rc2 (2019-10-06 14:27:30 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git tags/mips_fixes_5.4_2

for you to fetch changes up to 2f2b4fd674cadd8c6b40eb629e140a14db4068fd:

  MIPS: Disable Loongson MMI instructions for kernel build (2019-10-10 11:58:52 -0700)


A few MIPS fixes for 5.4:

- Build fixes for CONFIG_OPTIMIZE_INLINING=y builds in which the
  compiler may choose not to inline __xchg() & __cmpxchg().

- A build fix for Loongson configurations with GCC 9.x.

- Expose some extra HWCAP bits to indicate support for various
  instruction set extensions to userland.

- Fix bad stack access in firmware handling code for old SNI
  RM200/300/400 machines.


Jiaxun Yang (1):
  MIPS: elf_hwcap: Export userspace ASEs

Paul Burton (1):
  MIPS: Disable Loongson MMI instructions for kernel build

Thomas Bogendoerfer (3):
  MIPS: include: Mark __cmpxchg as __always_inline
  MIPS: include: Mark __xchg as __always_inline
  MIPS: fw: sni: Fix out of bounds init of o32 stack

 arch/mips/fw/sni/sniprom.c |  2 +-
 arch/mips/include/asm/cmpxchg.h|  9 +
 arch/mips/include/uapi/asm/hwcap.h | 11 +++
 arch/mips/kernel/cpu-probe.c   | 33 +
 arch/mips/loongson64/Platform  |  4 
 arch/mips/vdso/Makefile|  1 +
 6 files changed, 55 insertions(+), 5 deletions(-)




2019-10-12 Thread Julia Lawall




On Sat, 12 Oct 2019, Wambui Karuga wrote:

> Remove typedef declaration for enum cvmx_fau_reg_32.
> Also replace its previous uses with new declaration format.
> Issue found by checkpatch.pl
>
> Signed-off-by: Wambui Karuga 
> ---
>  drivers/staging/octeon/octeon-stubs.h | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/staging/octeon/octeon-stubs.h b/drivers/staging/octeon/octeon-stubs.h
> index 0991be329139..40f0cfee0dff 100644
> --- a/drivers/staging/octeon/octeon-stubs.h
> +++ b/drivers/staging/octeon/octeon-stubs.h
> @@ -201,9 +201,9 @@ union cvmx_helper_link_info {
>   } s;
>  };
>
> -typedef enum {
> +enum cvmx_fau_reg_32 {
>   CVMX_FAU_REG_32_START   = 0,
> -} cvmx_fau_reg_32_t;
> +};
>
>  typedef enum {
>   CVMX_FAU_OP_SIZE_8 = 0,
> @@ -1178,16 +1178,18 @@ union cvmx_gmxx_rxx_rx_inbnd {
>   } s;
>  };
>
> -static inline int32_t cvmx_fau_fetch_and_add32(cvmx_fau_reg_32_t reg,
> +static inline int32_t cvmx_fau_fetch_and_add32(enum cvmx_fau_reg_32 reg,
>  int32_t value)

These int32_t's don't look very desirable either.  If there is only one
possible definition, you can just replace it by what it is defined to be.

julia

>  {
>   return value;
>  }
>
> -static inline void cvmx_fau_atomic_add32(cvmx_fau_reg_32_t reg, int32_t value)
> +static inline void cvmx_fau_atomic_add32(enum cvmx_fau_reg_32 reg,
> +  int32_t value)
>  { }
>
> -static inline void cvmx_fau_atomic_write32(cvmx_fau_reg_32_t reg, int32_t value)
> +static inline void cvmx_fau_atomic_write32(enum cvmx_fau_reg_32 reg,
> +int32_t value)
>  { }
>
>  static inline uint64_t cvmx_scratch_read64(uint64_t address)
> @@ -1364,7 +1366,7 @@ static inline int cvmx_spi_restart_interface(int interface,
>  }
>
>  static inline void cvmx_fau_async_fetch_and_add32(uint64_t scraddr,
> -   cvmx_fau_reg_32_t reg,
> +   enum cvmx_fau_reg_32 reg,
> int32_t value)
>  { }
>
> --
> 2.23.0
>
> --
> You received this message because you are subscribed to the Google Groups 
> "outreachy-kernel" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to outreachy-kernel+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/outreachy-kernel/b7216f423d8e06b2ed7ac2df643a9215cd95be32.1570821661.git.wambui.karugax%40gmail.com.
>

Re: [Outreachy kernel] [PATCH v2 0/5] Remove typedef declarations in staging: octeon

2019-10-12 Thread Julia Lawall




On Sat, 12 Oct 2019, Wambui Karuga wrote:

> This patchset removes the addition of new typedefs data types in octeon,
> along with replacing the previous uses with the new declaration format.
>
> v2 of the series removes the obsolete "_t" notation in the named types.
>
> Wambui Karuga (5):
>   staging: octeon: remove typedef declaration for cvmx_wqe
>   staging: octeon: remove typedef declaration for cvmx_helper_link_info
>   staging: octeon: remove typedef declaration for cvmx_fau_reg_32
>   staging: octeon: remove typedef declartion for cvmx_pko_command_word0
>   staging: octeon: remove typedef declaration for cvmx_fau_op_size
>
>  drivers/staging/octeon/ethernet-mdio.c   |  6 +--
>  drivers/staging/octeon/ethernet-rgmii.c  |  4 +-
>  drivers/staging/octeon/ethernet-rx.c |  6 +--
>  drivers/staging/octeon/ethernet-tx.c |  4 +-
>  drivers/staging/octeon/ethernet.c|  6 +--
>  drivers/staging/octeon/octeon-ethernet.h |  2 +-
>  drivers/staging/octeon/octeon-stubs.h| 56 
>  7 files changed, 43 insertions(+), 41 deletions(-)

For the series:

Acked-by: Julia Lawall 

>
> --
> 2.23.0
>

mac80211: Checking a kmemdup() call in ieee80211_send_assoc()

2019-10-12 Thread Markus Elfring

Hello,

I tried out another script for the semantic patch language.
This source code analysis approach points out that the implementation
of the function “ieee80211_send_assoc” still contains an unchecked call
of the function “kmemdup”.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/mac80211/mlme.c?id=1c0cc5f1ae5ee5a6913704c0d75a6e99604ee30a#n980
https://elixir.bootlin.com/linux/v5.4-rc2/source/net/mac80211/mlme.c#L980

What do you think about improving it?

Regards,
Markus
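
For readers unfamiliar with the pattern being flagged, here is a minimal userspace sketch of checking the result of a duplicating allocation before recording its length. It is not the mac80211 code: memdup_sketch(), struct assoc_sketch and assoc_sketch_set_ie() are hypothetical names, and malloc() stands in for the kernel allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for kmemdup(): returns NULL on failure. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}

/* Hypothetical container for a duplicated buffer and its length. */
struct assoc_sketch {
	unsigned char *ie;
	size_t ie_len;
};

/* Returns 0 on success, -ENOMEM if the copy could not be allocated. */
static int assoc_sketch_set_ie(struct assoc_sketch *a, const void *ie,
			       size_t len)
{
	unsigned char *copy;

	assert(a != NULL && ie != NULL);

	copy = memdup_sketch(ie, len);
	if (!copy)
		return -ENOMEM;	/* never store a length for a missing buffer */

	free(a->ie);
	a->ie = copy;
	a->ie_len = len;
	return 0;
}
```

The point of the check is that the pointer and the length are only updated together, so a failed allocation cannot leave a non-zero length paired with a NULL buffer.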

drivers/net/ethernet/pensando/ionic/ionic_lif.c:333:2: error: implicit declaration of function 'dynamic_hex_dump'

2019-10-12 Thread kbuild test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   1c0cc5f1ae5ee5a6913704c0d75a6e99604ee30a
commit: 77ceb68e29ccd25d923b6af59e74ecaf736cc4b7 ionic: Add notifyq support
date:   5 weeks ago
config: x86_64-randconfig-a002-201941 (attached as .config)
compiler: gcc-6 (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
reproduce:
        git checkout 77ceb68e29ccd25d923b6af59e74ecaf736cc4b7
        # save the attached .config to linux build tree
        make ARCH=x86_64

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   drivers/net/ethernet/pensando/ionic/ionic_lif.c: In function 'ionic_notifyq_service':
>> drivers/net/ethernet/pensando/ionic/ionic_lif.c:333:2: error: implicit declaration of function 'dynamic_hex_dump' [-Werror=implicit-function-declaration]
     dynamic_hex_dump("event ", DUMP_PREFIX_OFFSET, 16, 1,
     ^~~~
   cc1: some warnings being treated as errors

vim +/dynamic_hex_dump +333 drivers/net/ethernet/pensando/ionic/ionic_lif.c

   311  
   312  static bool ionic_notifyq_service(struct ionic_cq *cq,
   313                                    struct ionic_cq_info *cq_info)
   314  {
   315          union ionic_notifyq_comp *comp = cq_info->cq_desc;
   316          struct net_device *netdev;
   317          struct ionic_queue *q;
   318          struct ionic_lif *lif;
   319          u64 eid;
   320  
   321          q = cq->bound_q;
   322          lif = q->info[0].cb_arg;
   323          netdev = lif->netdev;
   324          eid = le64_to_cpu(comp->event.eid);
   325  
   326          /* Have we run out of new completions to process? */
   327          if (eid <= lif->last_eid)
   328                  return false;
   329  
   330          lif->last_eid = eid;
   331  
   332          dev_dbg(lif->ionic->dev, "notifyq event:\n");
 > 333          dynamic_hex_dump("event ", DUMP_PREFIX_OFFSET, 16, 1,
   334                           comp, sizeof(*comp), true);
   335  
   336          switch (le16_to_cpu(comp->event.ecode)) {
   337          case IONIC_EVENT_LINK_CHANGE:
   338                  netdev_info(netdev, "Notifyq IONIC_EVENT_LINK_CHANGE eid=%lld\n",
   339                              eid);
   340                  netdev_info(netdev,
   341                              "  link_status=%d link_speed=%d\n",
   342                              le16_to_cpu(comp->link_change.link_status),
   343                              le32_to_cpu(comp->link_change.link_speed));
   344                  break;
   345          case IONIC_EVENT_RESET:
   346                  netdev_info(netdev, "Notifyq IONIC_EVENT_RESET eid=%lld\n",
   347                              eid);
   348                  netdev_info(netdev, "  reset_code=%d state=%d\n",
   349                              comp->reset.reset_code,
   350                              comp->reset.state);
   351                  break;
   352          default:
   353                  netdev_warn(netdev, "Notifyq unknown event ecode=%d eid=%lld\n",
   354                              comp->event.ecode, eid);
   355                  break;
   356          }
   357  
   358          return true;
   359  }
   360  

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip
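
An implicit-declaration error of this kind typically appears when a helper is only declared under some build option while a caller uses it unconditionally. The following is a minimal userspace sketch of one common remedy, a no-op stub for the disabled case. SKETCH_HEX_DUMP and the sketch_* names are hypothetical; this is not the kernel's dynamic debug code, and it is not necessarily the fix that was applied to the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* SKETCH_HEX_DUMP is a hypothetical switch standing in for a config option. */
#ifdef SKETCH_HEX_DUMP
/* Real implementation: prints the bytes and returns how many were dumped. */
static int sketch_hex_dump(const char *prefix, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	printf("%s", prefix);
	for (i = 0; i < len; i++)
		printf("%02x ", p[i]);
	printf("\n");
	return (int)len;
}
#else
/* No-op stub: callers still compile when the feature is switched off. */
static inline int sketch_hex_dump(const char *prefix, const void *buf,
				  size_t len)
{
	(void)prefix;
	(void)buf;
	(void)len;
	return 0;
}
#endif

/* Hypothetical caller that uses the helper unconditionally. */
static int sketch_service_event(const void *event, size_t len)
{
	assert(event != NULL);
	return sketch_hex_dump("event ", event, len);
}
```

With the stub in place the caller needs no conditional compilation of its own, which keeps randconfig builds from depending on the option.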

Hello

2019-10-12 Thread İshak BURAKGAZI

Hi,

Please, it is very important that we speak and discuss my proposal regarding
the letter I sent you earlier about this deposit.

ishak.

SUNRPC: Checking a kmemdup() call in xdr_netobj_dup()

2019-10-12 Thread Markus Elfring

Hello,

I tried out another script for the semantic patch language.
This source code analysis approach points out that the implementation
of the function “xdr_netobj_dup” still contains an unchecked call
of the function “kmemdup”.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/sunrpc/xdr.h?id=1c0cc5f1ae5ee5a6913704c0d75a6e99604ee30a#n167
https://elixir.bootlin.com/linux/v5.4-rc2/source/include/linux/sunrpc/xdr.h#L167

What do you think about improving it?

Regards,
Markus

Re: drivers/net/ethernet/pensando/ionic/ionic_lif.c:333:2: error: implicit declaration of function 'dynamic_hex_dump'; did you mean 'seq_hex_dump'?

2019-10-12 Thread Shannon Nelson


On 10/12/19 10:45 AM, kbuild test robot wrote:

Hi Shannon,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   1c0cc5f1ae5ee5a6913704c0d75a6e99604ee30a
commit: 77ceb68e29ccd25d923b6af59e74ecaf736cc4b7 ionic: Add notifyq support
date:   5 weeks ago
config: x86_64-randconfig-a002-201941 (attached as .config)
compiler: gcc-7 (Debian 7.4.0-13) 7.4.0
reproduce:
 git checkout 77ceb68e29ccd25d923b6af59e74ecaf736cc4b7
 # save the attached .config to linux build tree
 make ARCH=x86_64

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 


Hmmm, I thought Arnd Bergmann had already addressed these, and I Acked:

https://lore.kernel.org/netdev/91b69922-926a-9c27-3a08-e2db2d7ea...@pensando.io/

Dave, is there something more I need to do here?

sln




All errors (new ones prefixed by >>):

drivers/net/ethernet/pensando/ionic/ionic_lif.c: In function 'ionic_notifyq_service':
drivers/net/ethernet/pensando/ionic/ionic_lif.c:333:2: error: implicit declaration of function 'dynamic_hex_dump'; did you mean 'seq_hex_dump'? [-Werror=implicit-function-declaration]
  dynamic_hex_dump("event ", DUMP_PREFIX_OFFSET, 16, 1,
  ^~~~
  seq_hex_dump
cc1: some warnings being treated as errors

vim +333 drivers/net/ethernet/pensando/ionic/ionic_lif.c

   311  
   312  static bool ionic_notifyq_service(struct ionic_cq *cq,
   313                                    struct ionic_cq_info *cq_info)
   314  {
   315          union ionic_notifyq_comp *comp = cq_info->cq_desc;
   316          struct net_device *netdev;
   317          struct ionic_queue *q;
   318          struct ionic_lif *lif;
   319          u64 eid;
   320  
   321          q = cq->bound_q;
   322          lif = q->info[0].cb_arg;
   323          netdev = lif->netdev;
   324          eid = le64_to_cpu(comp->event.eid);
   325  
   326          /* Have we run out of new completions to process? */
   327          if (eid <= lif->last_eid)
   328                  return false;
   329  
   330          lif->last_eid = eid;
   331  
   332          dev_dbg(lif->ionic->dev, "notifyq event:\n");
 > 333          dynamic_hex_dump("event ", DUMP_PREFIX_OFFSET, 16, 1,
   334                           comp, sizeof(*comp), true);
   335  
   336          switch (le16_to_cpu(comp->event.ecode)) {
   337          case IONIC_EVENT_LINK_CHANGE:
   338                  netdev_info(netdev, "Notifyq IONIC_EVENT_LINK_CHANGE eid=%lld\n",
   339                              eid);
   340                  netdev_info(netdev,
   341                              "  link_status=%d link_speed=%d\n",
   342                              le16_to_cpu(comp->link_change.link_status),
   343                              le32_to_cpu(comp->link_change.link_speed));
   344                  break;
   345          case IONIC_EVENT_RESET:
   346                  netdev_info(netdev, "Notifyq IONIC_EVENT_RESET eid=%lld\n",
   347                              eid);
   348                  netdev_info(netdev, "  reset_code=%d state=%d\n",
   349                              comp->reset.reset_code,
   350                              comp->reset.state);
   351                  break;
   352          default:
   353                  netdev_warn(netdev, "Notifyq unknown event ecode=%d eid=%lld\n",
   354                              comp->event.ecode, eid);
   355                  break;
   356          }
   357  
   358          return true;
   359  }
   360  

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[PATCH v2 5/5] staging: octeon: remove typedef declaration for cvmx_fau_op_size

2019-10-12 Thread Wambui Karuga

Remove addition of new typedef for enum cvmx_fau_op_size.
Issue found by checkpatch.pl

Signed-off-by: Wambui Karuga 
---
 drivers/staging/octeon/octeon-stubs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/octeon/octeon-stubs.h b/drivers/staging/octeon/octeon-stubs.h
index db2d6f64b666..1b72f02a361f 100644
--- a/drivers/staging/octeon/octeon-stubs.h
+++ b/drivers/staging/octeon/octeon-stubs.h
@@ -205,12 +205,12 @@ enum cvmx_fau_reg_32 {
         CVMX_FAU_REG_32_START   = 0,
 };
 
-typedef enum {
+enum cvmx_fau_op_size {
         CVMX_FAU_OP_SIZE_8 = 0,
         CVMX_FAU_OP_SIZE_16 = 1,
         CVMX_FAU_OP_SIZE_32 = 2,
         CVMX_FAU_OP_SIZE_64 = 3
-} cvmx_fau_op_size_t;
+};
 
 typedef enum {
         CVMX_SPI_MODE_UNKNOWN = 0,
-- 
2.23.0
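
As a small illustration of the conversion this series performs, here is a sketch of a tagged enum used directly in a function signature instead of through a typedef name. The sketch_* identifiers are hypothetical, and the mapping from operation-size code to a byte count is an assumption made for the example, not something taken from the octeon headers.

```c
#include <assert.h>

/* Tagged enum: users write "enum sketch_op_size", with no "_t" typedef. */
enum sketch_op_size {
	SKETCH_OP_SIZE_8 = 0,
	SKETCH_OP_SIZE_16 = 1,
	SKETCH_OP_SIZE_32 = 2,
	SKETCH_OP_SIZE_64 = 3
};

/*
 * Assumed mapping for illustration: code n denotes an operand of
 * 2^n bytes (8, 16, 32 and 64 bits respectively).
 */
static unsigned int sketch_op_bytes(enum sketch_op_size size)
{
	assert(size <= SKETCH_OP_SIZE_64);
	return 1u << size;
}
```

Spelling out the enum keyword at each use makes it visible at the call site that the parameter is an enumeration rather than an opaque type.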

[PATCH v2 3/5] staging: octeon: remove typedef declaration for cvmx_fau_reg_32

2019-10-12 Thread Wambui Karuga

Remove typedef declaration for enum cvmx_fau_reg_32.
Also replace its previous uses with new declaration format.
Issue found by checkpatch.pl

Signed-off-by: Wambui Karuga 
---
 drivers/staging/octeon/octeon-stubs.h | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/octeon/octeon-stubs.h b/drivers/staging/octeon/octeon-stubs.h
index 0991be329139..40f0cfee0dff 100644
--- a/drivers/staging/octeon/octeon-stubs.h
+++ b/drivers/staging/octeon/octeon-stubs.h
@@ -201,9 +201,9 @@ union cvmx_helper_link_info {
         } s;
 };
 
-typedef enum {
+enum cvmx_fau_reg_32 {
         CVMX_FAU_REG_32_START   = 0,
-} cvmx_fau_reg_32_t;
+};
 
 typedef enum {
         CVMX_FAU_OP_SIZE_8 = 0,
@@ -1178,16 +1178,18 @@ union cvmx_gmxx_rxx_rx_inbnd {
         } s;
 };
 
-static inline int32_t cvmx_fau_fetch_and_add32(cvmx_fau_reg_32_t reg,
+static inline int32_t cvmx_fau_fetch_and_add32(enum cvmx_fau_reg_32 reg,
                                                int32_t value)
 {
         return value;
 }
 
-static inline void cvmx_fau_atomic_add32(cvmx_fau_reg_32_t reg, int32_t value)
+static inline void cvmx_fau_atomic_add32(enum cvmx_fau_reg_32 reg,
+                                         int32_t value)
 { }
 
-static inline void cvmx_fau_atomic_write32(cvmx_fau_reg_32_t reg, int32_t value)
+static inline void cvmx_fau_atomic_write32(enum cvmx_fau_reg_32 reg,
+                                           int32_t value)
 { }
 
 static inline uint64_t cvmx_scratch_read64(uint64_t address)
@@ -1364,7 +1366,7 @@ static inline int cvmx_spi_restart_interface(int interface,
 }
 
 static inline void cvmx_fau_async_fetch_and_add32(uint64_t scraddr,
-                                                  cvmx_fau_reg_32_t reg,
+                                                  enum cvmx_fau_reg_32 reg,
                                                   int32_t value)
 { }
 
-- 
2.23.0
