Re: [PATCH RFC 0/5] rcu doc updates for whatisRCU and checklist

2018-10-05 Thread Theodore Y. Ts'o
On Fri, Oct 05, 2018 at 08:45:40PM -0700, Paul E. McKenney wrote:
> 
> Shouldn't the synchronize_rcu() precede the loop doing the kfree()
> calls?  Or am I missing something subtle?

No, you're not missing anything; that was a cut and paste error on my
part.  I was removing the rcu_read_unlock() before the kfree loop, and
accidentally removed the synchronize_rcu().  Then when I put it back,
I put it back in the wrong place.

The longer version:

I originally used rcu_read_lock() and rcu_read_unlock() around setting
up to_free[] --- since whatisRCU.txt didn't talk about
rcu_dereference_protected(), just rcu_dereference() in Section 2: "What
is RCU's Core API?"

Then when I looked at the example in Section 3, I was surprised when I
didn't see the rcu_read_[un]lock() on the updater side, and spent some
time trying to figure out how to use rcu_dereference_protected().

Then when I did the transmutation from
rcu_read_lock/rcu_dereference/rcu_read_unlock to
rcu_dereference_protected, I bobbled the location of
synchronize_rcu().

- Ted

P.S.  Pedagogically, it might make sense to show an example that only
uses the RCU core API --- I assume using rcu_read_[un]lock() and
rcu_dereference() does work; it's just non-optimal, right?  --- and
then introduce the use of rcu_dereference_protected() afterwards.
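
For reference, the ordering being discussed (unpublish the old pointers,
wait for a grace period, and only then free) can be sketched in userspace.
This is a minimal sketch under stated assumptions, not the ext4 code: the
RCU primitives and kfree() below are stand-in stubs that only count events,
and MAXQUOTAS, struct sb_info, dup_name() and restore_qf_names() are
hypothetical names introduced for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define MAXQUOTAS 3	/* stand-in for EXT4_MAXQUOTAS */

/* Userspace stand-ins, NOT the kernel implementations. */
static int grace_periods;	/* completed synchronize_rcu() calls */
static int early_frees;		/* frees issued before any grace period */
static int safe_frees;		/* frees issued after a grace period */

#define rcu_dereference_protected(p, cond)	(p)
#define rcu_assign_pointer(p, v)		((p) = (v))

static void synchronize_rcu(void)
{
	grace_periods++;	/* the real call blocks until readers finish */
}

static void kfree(void *p)
{
	if (!p)
		return;
	if (grace_periods > 0)
		safe_frees++;
	else
		early_frees++;
	free(p);
}

struct sb_info {
	char *qf_names[MAXQUOTAS];	/* would be char __rcu * in the kernel */
};

/* Hypothetical helper: copy a name into freshly allocated memory. */
static char *dup_name(const char *s)
{
	char *p = malloc(strlen(s) + 1);

	if (p)
		strcpy(p, s);
	return p;
}

/* Restore the old names: unpublish, wait for readers, then free. */
static void restore_qf_names(struct sb_info *sbi, char *old[MAXQUOTAS])
{
	char *to_free[MAXQUOTAS];
	int i;

	for (i = 0; i < MAXQUOTAS; i++) {
		to_free[i] = rcu_dereference_protected(sbi->qf_names[i], 1);
		rcu_assign_pointer(sbi->qf_names[i], old[i]);
	}
	synchronize_rcu();		/* must precede the kfree() loop */
	for (i = 0; i < MAXQUOTAS; i++)
		kfree(to_free[i]);
}
```

With the synchronize_rcu() call moved below the kfree() loop, the stub
would count every free as "early", which is the bug Paul spotted.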


Re: [PATCH RFC 0/5] rcu doc updates for whatisRCU and checklist

2018-10-05 Thread Joel Fernandes
On Fri, Oct 05, 2018 at 08:45:40PM -0700, Paul E. McKenney wrote:
> On Fri, Oct 05, 2018 at 07:46:28PM -0400, Theodore Y. Ts'o wrote:
> > On Fri, Oct 05, 2018 at 04:18:09PM -0700, Joel Fernandes (Google) wrote:
> > > 
> > > Here are this week's rcu doc updates based on combing through whatisRCU and
> > > checklists. Hopefully you agree with them. I left several old _bh and _sched
> > > API references as is, since I don't think it's a good idea to remove them till
> > > the APIs themselves are removed, however I did remove several of them as well
> > > (like in the first patch in this series) since I feel it's better to "encourage"
> > > new users not to use the old API.
> > 
> > Hi Joel,
> > 
> > As it so happens, I just recently wrote my first RCU patch[1] (file
> > systems, especially on-disk data structures, generally tend not to be
> > good candidates for RCU semantics).
> > 
> > [1] http://patchwork.ozlabs.org/patch/979779/
> 
> Very cool!
> 
> One question...  In the following hunk:
> 
> 
> 
> @@ -5353,9 +5362,13 @@  static int ext4_remount(struct super_block *sb, int 
> *flags, char *data)
>  #ifdef CONFIG_QUOTA
>   sbi->s_jquota_fmt = old_opts.s_jquota_fmt;
>   for (i = 0; i < EXT4_MAXQUOTAS; i++) {
> - kfree(sbi->s_qf_names[i]);
> - sbi->s_qf_names[i] = old_opts.s_qf_names[i];
> + to_free[i] = rcu_dereference_protected(sbi->s_qf_names[i],
> +&sb->s_umount);
> + rcu_assign_pointer(sbi->s_qf_names[i], old_opts.s_qf_names[i]);
>   }
> + for (i = 0; i < EXT4_MAXQUOTAS; i++)
> + kfree(to_free[i]);
> + synchronize_rcu();
>  #endif
>   kfree(orig_data);
>   return err;
> 
> 
> 
> Shouldn't the synchronize_rcu() precede the loop doing the kfree()
> calls?  Or am I missing something subtle?
> 
> Otherwise, looks good!  I was worried that seq_show_option() might
> sleep, but it looks like it is just putting characters into an
> array.  If there is lingering concern, CONFIG_PROVE_LOCKING will
> usually catch that sort of thing.

Also I was wondering if the "if (sbi->s_qf_names[USRQUOTA])" in the patch
should be "if (rcu_dereference(sbi->s_qf_names[USRQUOTA]))". I don't think
the compiler could optimize the access in this case, but IMO using
rcu_dereference would serve to document that it's an RCU-protected pointer
anyway.
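
A reader-side sketch of that suggestion, assuming userspace stand-ins for
the RCU primitives (they only track nesting and count misuse) and a
hypothetical show_usrquota() helper in place of the real show-options code:

```c
#include <stdio.h>
#include <string.h>

/* Userspace stand-ins, NOT the kernel implementations. */
static int read_depth;		/* nesting depth of read-side sections */
static int unprotected_derefs;	/* rcu_dereference() outside a section */

static void rcu_read_lock(void)   { read_depth++; }
static void rcu_read_unlock(void) { read_depth--; }

/* Typed helper behind the stand-in macro; counts unprotected use. */
static char *rcu_dereference_name(char *p)
{
	if (read_depth == 0)
		unprotected_derefs++;
	return p;
}
#define rcu_dereference(p) rcu_dereference_name(p)

enum { USRQUOTA, GRPQUOTA, PRJQUOTA, MAXQUOTAS };

struct sb_info {
	char *qf_names[MAXQUOTAS];	/* would be char __rcu * in the kernel */
};

/* Hypothetical reader: format the user quota file name into buf. */
static int show_usrquota(struct sb_info *sbi, char *buf, size_t len)
{
	char *name;
	int n = 0;

	rcu_read_lock();
	name = rcu_dereference(sbi->qf_names[USRQUOTA]);
	if (name)	/* test the fetched copy, not the shared pointer */
		n = snprintf(buf, len, ",usrjquota=%s", name);
	rcu_read_unlock();
	return n;
}
```

Fetching the pointer once into a local and testing that local also avoids
reading the shared pointer twice inside the critical section.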

thanks,

 - Joel



Re: [PATCH RFC 0/5] rcu doc updates for whatisRCU and checklist

2018-10-05 Thread Joel Fernandes
On Fri, Oct 05, 2018 at 08:45:40PM -0700, Paul E. McKenney wrote:
> On Fri, Oct 05, 2018 at 07:46:28PM -0400, Theodore Y. Ts'o wrote:
> > On Fri, Oct 05, 2018 at 04:18:09PM -0700, Joel Fernandes (Google) wrote:
> > > 
> > > Here are this week's rcu doc updates based on combing through whatisRCU and
> > > checklists. Hopefully you agree with them. I left several old _bh and _sched
> > > API references as is, since I don't think it's a good idea to remove them till
> > > the APIs themselves are removed, however I did remove several of them as well
> > > (like in the first patch in this series) since I feel it's better to "encourage"
> > > new users not to use the old API.
> > 
> > Hi Joel,
> > 
> > As it so happens, I just recently wrote my first RCU patch[1] (file
> > systems, especially on-disk data structures, generally tend not to be
> > good candidates for RCU semantics).
> > 
> > [1] http://patchwork.ozlabs.org/patch/979779/
> 
> Very cool!
> 
> One question...  In the following hunk:
> 
> 
> 
> @@ -5353,9 +5362,13 @@  static int ext4_remount(struct super_block *sb, int 
> *flags, char *data)
>  #ifdef CONFIG_QUOTA
>   sbi->s_jquota_fmt = old_opts.s_jquota_fmt;
>   for (i = 0; i < EXT4_MAXQUOTAS; i++) {
> - kfree(sbi->s_qf_names[i]);
> - sbi->s_qf_names[i] = old_opts.s_qf_names[i];

Could you annotate this pointer (sbi->s_qf_names) with __rcu so it can be
checked by sparse for proper usage? It's also point #16 in the checklist.txt
RCU document. I enclosed a diff to do this below.

I also saw a bunch of places in super.c where the pointer isn't accessed from
an RCU read-side section or via rcu_dereference, but it was a quick look so
sorry if I missed something. If that's true, are you planning to convert these
to use rcu_dereference, wrapped in rcu_read_lock/unlock, as well?

> + to_free[i] = rcu_dereference_protected(sbi->s_qf_names[i],
> +&sb->s_umount);

Also should this be the following?
to_free[i] = rcu_dereference_protected(sbi->s_qf_names[i],
   lockdep_is_held(&sb->s_umount));
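
To illustrate why the condition argument matters, here is a minimal
userspace sketch. Everything in it is a stand-in: struct fake_lock, the
lockdep_is_held() stub, the simplified rcu_dereference_protected() macro and
the two fetch_*() helpers are assumptions made for illustration, not the
kernel definitions. The point is only that an address used as the condition
is always true, so the check can never fire.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins, NOT the kernel implementations. */
struct fake_lock {
	bool held;
};

static int lockdep_warnings;	/* how often the condition was false */

static bool lockdep_is_held(const struct fake_lock *l)
{
	return l->held;
}

/* Return p, counting a warning when the caller's condition is false. */
#define rcu_dereference_protected(p, cond) \
	((cond) ? (p) : (lockdep_warnings++, (p)))

struct super {
	struct fake_lock s_umount;
	char *qf_name;
};

/* Condition is an address: never NULL, so it can never warn. */
static char *fetch_unchecked(struct super *sb)
{
	return rcu_dereference_protected(sb->qf_name, &sb->s_umount);
}

/* Condition asks whether the lock is actually held. */
static char *fetch_checked(struct super *sb)
{
	return rcu_dereference_protected(sb->qf_name,
					 lockdep_is_held(&sb->s_umount));
}
```

Calling fetch_unchecked() without the lock held goes unnoticed in this
model, while fetch_checked() bumps the warning counter.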

> + rcu_assign_pointer(sbi->s_qf_names[i], old_opts.s_qf_names[i]);
>   }
> + for (i = 0; i < EXT4_MAXQUOTAS; i++)
> + kfree(to_free[i]);
> + synchronize_rcu();

I had the same concern as Paul here: the synchronize_rcu() should be done
before the kfree() loop.

thanks,

 - Joel

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 5863fd22e90b..eec1b3090d04 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5083,7 +5083,7 @@ struct ext4_mount_options {
 	u32 s_min_batch_time, s_max_batch_time;
 #ifdef CONFIG_QUOTA
 	int s_jquota_fmt;
-	char *s_qf_names[EXT4_MAXQUOTAS];
+	char __rcu *s_qf_names[EXT4_MAXQUOTAS];
 #endif
 };
 


Re: [PATCH RFC 0/5] rcu doc updates for whatisRCU and checklist

2018-10-05 Thread Paul E. McKenney
On Fri, Oct 05, 2018 at 07:46:28PM -0400, Theodore Y. Ts'o wrote:
> On Fri, Oct 05, 2018 at 04:18:09PM -0700, Joel Fernandes (Google) wrote:
> > 
> > Here are this week's rcu doc updates based on combing through whatisRCU and
> > checklists. Hopefully you agree with them. I left several old _bh and _sched
> > API references as is, since I don't think it's a good idea to remove them till
> > the APIs themselves are removed, however I did remove several of them as well
> > (like in the first patch in this series) since I feel it's better to "encourage"
> > new users not to use the old API.
> 
> Hi Joel,
> 
> As it so happens, I just recently wrote my first RCU patch[1] (file
> systems, especially on-disk data structures, generally tend not to be
> good candidates for RCU semantics).
> 
> [1] http://patchwork.ozlabs.org/patch/979779/

Very cool!

One question...  In the following hunk:



@@ -5353,9 +5362,13 @@  static int ext4_remount(struct super_block *sb, int *flags, char *data)
 #ifdef CONFIG_QUOTA
 	sbi->s_jquota_fmt = old_opts.s_jquota_fmt;
 	for (i = 0; i < EXT4_MAXQUOTAS; i++) {
-		kfree(sbi->s_qf_names[i]);
-		sbi->s_qf_names[i] = old_opts.s_qf_names[i];
+		to_free[i] = rcu_dereference_protected(sbi->s_qf_names[i],
+						       &sb->s_umount);
+		rcu_assign_pointer(sbi->s_qf_names[i], old_opts.s_qf_names[i]);
 	}
+	for (i = 0; i < EXT4_MAXQUOTAS; i++)
+		kfree(to_free[i]);
+	synchronize_rcu();
 #endif
 	kfree(orig_data);
 	return err;



Shouldn't the synchronize_rcu() precede the loop doing the kfree()
calls?  Or am I missing something subtle?

Otherwise, looks good!  I was worried that seq_show_option() might
sleep, but it looks like it is just putting characters into an
array.  If there is lingering concern, CONFIG_PROVE_LOCKING will
usually catch that sort of thing.

Thanx, Paul

> So if you are working on improving RCU documentation, I thought I
> would give two comments on the RCU docs from the perspective of a
> developer trying to use RCU for the first time.
> 
> * whatisRCU is great, but the example in Section 3 uses
>   rcu_dereference_protected() without explaining it.  Given that using
>   that function seems to be considered best practice, maybe a few more
>   words there would be in order?  That function isn't mentioned in
>   rcu.txt either, BTW.
> 
> * lockdep.txt *does* explain what rcu_dereference_protected() does,
>   but it doesn't really describe lockdep_is_held().  You can mostly
>   figure it out from context, but it wasn't obvious to me what locks
>   it could be used against, and in the case of a rw_semaphore, whether
>   it applied to shared as well as exclusive locks.  That's a lockdep
>   abstraction, and not a RCU abstraction, but lockdep isn't
>   particularly well documented, so I ended up spending 20-30 minutes
>   or so looking at the lockdep implementation before I was sure it
>   actually worked the way I thought it was going to.
> 
> Anyway, I was going to put submitting a patch to improve whatisRCU on
> my (vastly over-long) TODO list, but when I saw your patch set, I
> couldn't resist trying to see if I could fob it off on you.  If you
> don't think that's fair (and it probably isn't really), just let me
> know, and I'll put it back on my todo list.  :-)
> 
> Cheers,
> 
>   - Ted
> 



Re: [GIT PULL linux-next] Add Compiler Attributes (v6) tree

2018-10-05 Thread Stephen Rothwell
Hi Miguel,

On Fri, 5 Oct 2018 13:32:25 +0200 Miguel Ojeda wrote:
>
> As discussed, here it is the Compiler Attributes series for
> linux-next. This time the original v6, based on -rc6.
> 
> The changes w.r.t. v5:
> 
>   - Added latest Reviewed-by's and Tested-by's.
> 
> The conflicts are trivial to solve, but if you want a reference, take
> a look at (rebased on top of next-20181005):
> 
>   https://github.com/ojeda/linux.git compiler-attributes-rebased
> 
> Thanks!
> 
> Cheers,
> Miguel
> 
> The following changes since commit 17b57b1883c1285f3d0dc2266e8f79286a7bef38:
> 
>   Linux 4.19-rc6 (2018-09-30 07:15:35 -0700)
> 
> are available in the Git repository at:
> 
>   https://github.com/ojeda/linux.git compiler-attributes

Thanks for adding your subsystem tree as a participant of linux-next.  As
you may know, this is not a judgement of your code.  The purpose of
linux-next is for integration testing and to lower the impact of
conflicts between subsystems in the next merge window. 

You will need to ensure that the patches/commits in your tree/series have
been:
 * submitted under GPL v2 (or later) and include the Contributor's
Signed-off-by,
 * posted to the relevant mailing list,
 * reviewed by you (or another maintainer of your subsystem tree),
 * successfully unit tested, and 
 * destined for the current or next Linux merge window.

Basically, this should be just what you would send to Linus (or ask him
to fetch).  It is allowed to be rebased if you deem it necessary.

Just in case you are unaware, I will fetch that branch every morning
and then merge it into linux-next, so any updates will be added without
you having to send a new pull request.  I will send you an email if I
find a problem.
-- 
Cheers,
Stephen Rothwell 
s...@canb.auug.org.au




Re: [PATCH RFC 0/5] rcu doc updates for whatisRCU and checklist

2018-10-05 Thread Joel Fernandes
On Fri, Oct 05, 2018 at 05:13:26PM -0700, Paul E. McKenney wrote:
> On Fri, Oct 05, 2018 at 04:18:09PM -0700, Joel Fernandes (Google) wrote:
> > Hi Paul,
> > 
> > Here are this week's rcu doc updates based on combing through whatisRCU and
> > checklists. Hopefully you agree with them. I left several old _bh and _sched
> > API references as is, since I don't think it's a good idea to remove them till
> > the APIs themselves are removed, however I did remove several of them as well
> > (like in the first patch in this series) since I feel it's better to "encourage"
> > new users not to use the old API.
> > 
> > Also do you think it makes sense for us to write coccinelle patches to check if
> > folks use them on new patches? Btw, I am new to coccinelle but I'd love to give
> > it a try, it looks exciting. I remember you saying you wanted to do that, so
> > that's something else you could potentially offload to me as you see fit ;-)
> > 
> > Thank you very much!
> 
> Good catches, applied and pushed, thank you!
> 
> I updated the commit log and a couple of the patches a bit, so could
> you please double-check them?

Yes I checked and it looks good, thanks!

 - Joel



Re: [PATCH RFC 0/5] rcu doc updates for whatisRCU and checklist

2018-10-05 Thread Joel Fernandes
On Fri, Oct 05, 2018 at 07:46:28PM -0400, Theodore Y. Ts'o wrote:
> On Fri, Oct 05, 2018 at 04:18:09PM -0700, Joel Fernandes (Google) wrote:
> > 
> > Here are this week's rcu doc updates based on combing through whatisRCU and
> > checklists. Hopefully you agree with them. I left several old _bh and _sched
> > API references as is, since I don't think it's a good idea to remove them till
> > the APIs themselves are removed, however I did remove several of them as well
> > (like in the first patch in this series) since I feel it's better to "encourage"
> > new users not to use the old API.
> 
> Hi Joel,

Hi Ted,

> 
> As it so happens, I just recently wrote my first RCU patch[1] (file
> systems, especially on-disk data structures, generally tend not to be
> good candidates for RCU semantics).
> 
> [1] http://patchwork.ozlabs.org/patch/979779/
> 
> So if you are working on improving RCU documentation, I thought I
> would give two comments on the RCU docs from the perspective of a
> developer trying to use RCU for the first time.
> 
> * whatisRCU is great, but the example in Section 3 uses
>   rcu_dereference_protected() without explaining it.  Given that using
>   that function seems to be considered best practice, maybe a few more
>   words there would be in order?  That function isn't mentioned in
>   rcu.txt either, BTW.

I actually felt the same about rcu_dereference_protected while reading and
then looked at the comment above the implementation. The code comments are
pretty detailed, but I agree the example should mention a few words about it
since it uses it. I could look into improving that, no problem.

> * lockdep.txt *does* explain what rcu_dereference_protected() does,
>   but it doesn't really describe lockdep_is_held().  You can mostly
>   figure it out from context, but it wasn't obvious to me what locks
>   it could be used against, and in the case of a rw_semaphore, whether
>   it applied to shared as well as exclusive locks.  That's a lockdep
>   abstraction, and not a RCU abstraction, but lockdep isn't
>   particularly well documented, so I ended up spending 20-30 minutes
>   or so looking at the lockdep implementation before I was sure it
>   actually worked the way I thought it was going to.

Ok, makes sense to improve it. Since I haven't looked through lockdep.txt
yet (as a part of my broader documentation effort for the RCU consolidation),
I can take up improving that based on your suggestions since I have to look
into it anyway :). I'll look into that next week and CC you on this.

> Anyway, I was going to put submitting a patch to improve whatisRCU on
> my (vastly over-long) TODO list, but when I saw your patch set, I
> couldn't resist trying to see if I could fob it off on you.  If you
> don't think that's fair (and it probably isn't really), just let me
> know, and I'll put it back on my todo list.  :-)

It's OK :) I'm happy to help! Thanks for letting me know.

Best,

 - Joel



Re: [PATCH v2 00/22] xfs-4.20: major documentation surgery

2018-10-05 Thread Dave Chinner
On Fri, Oct 05, 2018 at 07:01:20PM -0600, Jonathan Corbet wrote:
> On Sat, 6 Oct 2018 10:51:54 +1000
> Dave Chinner  wrote:
> 
> > Can you let us know whether the CC-by-SA 4.0 license is acceptable
> > or not? That's really the only thing that we need clarified at this
> > point - if it's OK I'll pull this into the XFS tree for the 4.20
> > merge window. If not, we'll go back to the drawing board.
> 
> I remain pretty concerned about it, to tell the truth.  Rather than
> continue to guess, though, I've called for help, and will be talking with
> the LF lawyer about this next Thursday.  Before then, I can't say anything
> except "I don't think this works..."
> 
> Will let you know what I hear.

Thanks for the update, Jon. I'll put this on the backburner until I
hear back from you.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH v2 00/22] xfs-4.20: major documentation surgery

2018-10-05 Thread Jonathan Corbet
On Sat, 6 Oct 2018 10:51:54 +1000
Dave Chinner  wrote:

> Can you let us know whether the CC-by-SA 4.0 license is acceptable
> or not? That's really the only thing that we need clarified at this
> point - if it's OK I'll pull this into the XFS tree for the 4.20
> merge window. If not, we'll go back to the drawing board.

I remain pretty concerned about it, to tell the truth.  Rather than
continue to guess, though, I've called for help, and will be talking with
the LF lawyer about this next Thursday.  Before then, I can't say anything
except "I don't think this works..."

Will let you know what I hear.

jon


Re: [PATCH v2 00/22] xfs-4.20: major documentation surgery

2018-10-05 Thread Dave Chinner
On Wed, Oct 03, 2018 at 09:18:11PM -0700, Darrick J. Wong wrote:
> Hi all,
> 
> This series converts the existing in-kernel xfs documentation to rst
> format, links it in with the rest of the kernel's rst documentation, and
> then begins pulling in the contents of the Data Structures & Algorithms
> book from the xfs-documentation git tree.  No changes are made to the
> text during the import process except to fix things that the conversion
> process (asciidoctor + pandoc) didn't do correctly.  The goal of this
> series is to tie together the XFS code with the on-disk format
> documentation for the features supported by the code.
> 
> I've built the docs and put them here, in case you hate reading rst:
> https://djwong.org/docs/kdoc/admin-guide/xfs.html
> https://djwong.org/docs/kdoc/filesystems/xfs-data-structures/index.html
> 
> I've posted a branch here because the png import patch is huge:
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=docs-4.20-merge
> 
> The patchset should apply cleanly against 4.19-rc6.  Comments and
> questions are, as always, welcome.

Jon,

Can you let us know whether the CC-by-SA 4.0 license is acceptable
or not? That's really the only thing that we need clarified at this
point - if it's OK I'll pull this into the XFS tree for the 4.20
merge window. If not, we'll go back to the drawing board.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH] docs: improve readability for people with poorer eyesight

2018-10-05 Thread Dave Chinner
On Thu, Oct 04, 2018 at 06:06:03PM -0700, Darrick J. Wong wrote:
> Hi,
> 
> So my eyesight still hasn't fully recovered, so in the meantime it's
> been difficult to read the online documentation.  Here's some stylesheet
> overrides I've been using to make it easier for me to read them:
> https://djwong.org/docs/kdoc/index.html
> 
> ---
> From: Darrick J. Wong 
> 
> My eyesight is not in good shape, which means that I have difficulty
> reading the online Linux documentation.  Specifically, body text is
> oddly small compared to list items and the contrast of various text
> elements is too low for me to be able to see easily.
> 
> Therefore, alter the HTML theme overrides to make the text larger and
> increase the contrast for better visibility, and trust the typeface
> choices of the reader's browser.
> 
> For the PDF output, increase the text size, use a sans-serif typeface
> for sans-serif text, and use a serif typeface for "roman" serif text.
> 
> Signed-off-by: Darrick J. Wong 

This fixes problems I noticed when trying to review the built html
documentation - the inconsistent font sizes on my high-dpi monitor
made it almost impossible to read even though I have no eyesight
problems.

Acked-by: Dave Chinner 

-Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH RFC 0/5] rcu doc updates for whatisRCU and checklist

2018-10-05 Thread Paul E. McKenney
On Fri, Oct 05, 2018 at 04:18:09PM -0700, Joel Fernandes (Google) wrote:
> Hi Paul,
> 
> Here are this week's rcu doc updates based on combing through whatisRCU and
> checklists. Hopefully you agree with them. I left several old _bh and _sched
> API references as is, since I don't think it's a good idea to remove them till
> the APIs themselves are removed, however I did remove several of them as well
> (like in the first patch in this series) since I feel it's better to "encourage"
> new users not to use the old API.
> 
> Also do you think it makes sense for us to write coccinelle patches to check if
> folks use them on new patches? Btw, I am new to coccinelle but I'd love to give
> it a try, it looks exciting. I remember you saying you wanted to do that, so
> that's something else you could potentially offload to me as you see fit ;-)
> 
> Thank you very much!

Good catches, applied and pushed, thank you!

I updated the commit log and a couple of the patches a bit, so could
you please double-check them?

Thanx, Paul

> Joel Fernandes (Google) (5):
>   doc: rcu: Update core and full API in whatisRCU
>   doc: rcu: Add more rationale for using rcu_read_lock_sched in
> checklist
>   doc: rcu: Remove obsolete suggestion from checklist
>   doc: rcu: Remove obsolete checklist item about synchronize_rcu usage
>   doc: rcu: Encourage use of rcu_barrier in checklist
> 
>  Documentation/RCU/checklist.txt | 49 +++--
>  Documentation/RCU/whatisRCU.txt | 55 +
>  2 files changed, 39 insertions(+), 65 deletions(-)
> 
> -- 
> 2.19.0.605.g01d371f741-goog
> 



Re: [PATCH 2/2] docs: promote the ext4 data structures book to top level

2018-10-05 Thread Theodore Y. Ts'o
On Thu, Oct 04, 2018 at 05:59:44PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong 
> 
> Move the ext4 data structures book to Documentation/filesystems/ext4/
> since the administrative information moved elsewhere.
> 
> Signed-off-by: Darrick J. Wong 

Thanks, applied and pushed out to the ext4.git tree.

Randy, Jon: the original patch didn't make it past vger.kernel.org
because it was too large (it was moving a lot of files around).  It
looked fine to me, but if you want to take a look it should be on the
dev branch of the ext4.git tree.

- Ted


Re: [PATCH RFC 0/5] rcu doc updates for whatisRCU and checklist

2018-10-05 Thread Theodore Y. Ts'o
On Fri, Oct 05, 2018 at 04:18:09PM -0700, Joel Fernandes (Google) wrote:
> 
> Here are this week's rcu doc updates based on combing through whatisRCU and
> checklists. Hopefully you agree with them. I left several old _bh and _sched
> API references as is, since I don't think it's a good idea to remove them till
> the APIs themselves are removed, however I did remove several of them as well
> (like in the first patch in this series) since I feel it's better to "encourage"
> new users not to use the old API.

Hi Joel,

As it so happens, I just recently wrote my first RCU patch[1] (file
systems, especially on-disk data structures, generally tend not to be
good candidates for RCU semantics).

[1] http://patchwork.ozlabs.org/patch/979779/

So if you are working on improving RCU documentation, I thought I
would give two comments on the RCU docs from the perspective of a
developer trying to use RCU for the first time.

* whatisRCU is great, but the example in Section 3 uses
  rcu_dereference_protected() without explaining it.  Given that using
  that function seems to be considered best practice, maybe a few more
  words there would be in order?  That function isn't mentioned in
  rcu.txt either, BTW.

* lockdep.txt *does* explain what rcu_dereference_protected() does,
  but it doesn't really describe lockdep_is_held().  You can mostly
  figure it out from context, but it wasn't obvious to me what locks
  it could be used against, and in the case of a rw_semaphore, whether
  it applied to shared as well as exclusive locks.  That's a lockdep
  abstraction, and not a RCU abstraction, but lockdep isn't
  particularly well documented, so I ended up spending 20-30 minutes
  or so looking at the lockdep implementation before I was sure it
  actually worked the way I thought it was going to.
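
The shared-versus-exclusive question in the second bullet can be modelled in
userspace as follows. This is a sketch of my understanding of lockdep in
kernels of this era, to be verified against include/linux/lockdep.h:
lockdep_is_held() is assumed to be true for a lock held in either mode, and
lockdep_is_held_type() is assumed to take 0 for exclusive, 1 for shared and
-1 for either. struct fake_rwsem and the down/up functions are stand-ins,
not the kernel implementations.

```c
#include <stdbool.h>

/* Userspace stand-in for lockdep's per-lock "held" state. */
enum held_mode { NOT_HELD, HELD_EXCLUSIVE, HELD_SHARED };

struct fake_rwsem {
	enum held_mode mode;
};

static void down_read(struct fake_rwsem *sem)  { sem->mode = HELD_SHARED; }
static void up_read(struct fake_rwsem *sem)    { sem->mode = NOT_HELD; }
static void down_write(struct fake_rwsem *sem) { sem->mode = HELD_EXCLUSIVE; }
static void up_write(struct fake_rwsem *sem)   { sem->mode = NOT_HELD; }

/* Modelled on lockdep_is_held(): true for either mode (assumption). */
static bool lockdep_is_held(const struct fake_rwsem *sem)
{
	return sem->mode != NOT_HELD;
}

/*
 * Modelled on lockdep_is_held_type() (assumption):
 * read == 0 exclusive only, read == 1 shared only, read == -1 either.
 */
static bool lockdep_is_held_type(const struct fake_rwsem *sem, int read)
{
	if (sem->mode == NOT_HELD)
		return false;
	if (read == -1)
		return true;
	if (read == 0)
		return sem->mode == HELD_EXCLUSIVE;
	return sem->mode == HELD_SHARED;
}
```

Under these assumptions, an updater that requires the exclusive side of a
rw_semaphore would want the typed check rather than plain lockdep_is_held().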

Anyway, I was going to put submitting a patch to improve whatisRCU on
my (vastly over-long) TODO list, but when I saw your patch set, I
couldn't resist trying to see if I could fob it off on you.  If you
don't think that's fair (and it probably isn't really), just let me
know, and I'll put it back on my todo list.  :-)

Cheers,

- Ted


[PATCH RFC 4/5] doc: rcu: Remove obsolete checklist item about synchronize_rcu usage

2018-10-05 Thread Joel Fernandes (Google)
Since the RCU mechanisms have been consolidated, this checklist item
no longer seems useful or relevant, and is probably even a bit misleading.
For example, synchronize_rcu will now guarantee that all interrupt-disabled
regions have finished executing. So let's remove this checklist item.

Signed-off-by: Joel Fernandes (Google) 
---
 Documentation/RCU/checklist.txt | 37 +++--
 1 file changed, 7 insertions(+), 30 deletions(-)

diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index cc22ce49618d..b90ad1b0665a 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -320,37 +320,14 @@ over a rather long period of time, but improvements are always welcome!
will break Alpha, cause aggressive compilers to generate bad code,
and confuse people trying to read your code.
 
-11.Note that synchronize_rcu() -only- guarantees to wait until
-   all currently executing rcu_read_lock()-protected RCU read-side
-   critical sections complete.  It does -not- necessarily guarantee
-   that all currently running interrupts, NMIs, preempt_disable()
-   code, or idle loops will complete.  Therefore, if your
-   read-side critical sections are protected by something other
-   than rcu_read_lock(), do -not- use synchronize_rcu().
-
-   Similarly, disabling preemption is not an acceptable substitute
-   for rcu_read_lock().  Code that attempts to use preemption
-   disabling where it should be using rcu_read_lock() will break
-   in CONFIG_PREEMPT=y kernel builds.
-
-   If you want to wait for interrupt handlers, NMI handlers, and
-   code under the influence of preempt_disable(), you instead
-   need to use synchronize_irq() or synchronize_sched().
-
-   This same limitation also applies to synchronize_rcu_bh()
-   and synchronize_srcu(), as well as to the asynchronous and
-   expedited forms of the three primitives, namely call_rcu(),
-   call_rcu_bh(), call_srcu(), synchronize_rcu_expedited(),
-   synchronize_rcu_bh_expedited(), and synchronize_srcu_expedited().
-
-12.Any lock acquired by an RCU callback must be acquired elsewhere
+11.Any lock acquired by an RCU callback must be acquired elsewhere
with softirq disabled, e.g., via spin_lock_irqsave(),
spin_lock_bh(), etc.  Failing to disable irq on a given
acquisition of that lock will result in deadlock as soon as
the RCU softirq handler happens to run your RCU callback while
interrupting that acquisition's critical section.
 
-13.RCU callbacks can be and are executed in parallel.  In many cases,
+12.RCU callbacks can be and are executed in parallel.  In many cases,
the callback code simply wrappers around kfree(), so that this
is not an issue (or, more accurately, to the extent that it is
an issue, the memory-allocator locking handles it).  However,
@@ -366,7 +343,7 @@ over a rather long period of time, but improvements are always welcome!
not the case, a self-spawning RCU callback would prevent the
victim CPU from ever going offline.)
 
-14.Unlike other forms of RCU, it -is- permissible to block in an
+13.Unlike other forms of RCU, it -is- permissible to block in an
SRCU read-side critical section (demarked by srcu_read_lock()
and srcu_read_unlock()), hence the "SRCU": "sleepable RCU".
Please note that if you don't need to sleep in read-side critical
@@ -410,7 +387,7 @@ over a rather long period of time, but improvements are always welcome!
Note that rcu_dereference() and rcu_assign_pointer() relate to
SRCU just as they do to other forms of RCU.
 
-15.The whole point of call_rcu(), synchronize_rcu(), and friends
+14.The whole point of call_rcu(), synchronize_rcu(), and friends
is to wait until all pre-existing readers have finished before
carrying out some otherwise-destructive operation.  It is
therefore critically important to -first- remove any path
@@ -422,13 +399,13 @@ over a rather long period of time, but improvements are always welcome!
is the caller's responsibility to guarantee that any subsequent
readers will execute safely.
 
-16.The various RCU read-side primitives do -not- necessarily contain
+15.The various RCU read-side primitives do -not- necessarily contain
memory barriers.  You should therefore plan for the CPU
and the compiler to freely reorder code into and out of RCU
read-side critical sections.  It is the responsibility of the
RCU update-side primitives to deal with this.
 
-17.Use CONFIG_PROVE_LOCKING, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the
+16.Use CONFIG_PROVE_LOCKING, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the
__rcu sparse checks to validate your RCU code.  These can help
find problems as follows:
 
@@ -451,7 +428,7 @@ over a rather 

[PATCH RFC 5/5] doc: rcu: Encourage use of rcu_barrier in checklist

2018-10-05 Thread Joel Fernandes (Google)
The checklist suggests rcu_barrier_bh for RCU-bh and similarly for
sched, however these APIs are now implemented as rcu_barrier itself due
to the RCU consolidation. They may also be removed in the future, so let's
correct the doc to encourage use of the rcu_barrier() API itself.
Similar changes are made in previous patches.

Signed-off-by: Joel Fernandes (Google) 
---
 Documentation/RCU/checklist.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index b90ad1b0665a..6f469864d9f5 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -442,8 +442,8 @@ over a rather long period of time, but improvements are always welcome!
You instead need to use one of the barrier functions:
 
o   call_rcu() -> rcu_barrier()
-   o   call_rcu_bh() -> rcu_barrier_bh()
-   o   call_rcu_sched() -> rcu_barrier_sched()
+   o   call_rcu_bh() -> rcu_barrier()
+   o   call_rcu_sched() -> rcu_barrier()
o   call_srcu() -> srcu_barrier()
 
However, these barrier functions are absolutely -not- guaranteed
-- 
2.19.0.605.g01d371f741-goog
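
As an illustration of the checklist item this patch touches, the following
userspace sketch models why an unload path waits with rcu_barrier() after
queuing callbacks with call_rcu(). The queue, the simplified struct rcu_head,
count_cb() and example_exit() are stand-ins and hypothetical names for
illustration only; the real rcu_barrier() waits for callbacks that RCU
invokes itself rather than running them directly.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct rcu_head. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

static struct rcu_head *pending;	/* callbacks not yet invoked */
static int invoked;			/* callbacks that have run */

/* Stand-in: queue a callback to run after a grace period. */
static void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Stand-in: return only once every queued callback has run. */
static void rcu_barrier(void)
{
	while (pending) {
		struct rcu_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

/* Hypothetical callback; a real one would typically free an object. */
static void count_cb(struct rcu_head *head)
{
	(void)head;
	invoked++;
}

/* Hypothetical unload path. */
static void example_exit(void)
{
	/* First stop queuing new callbacks (not shown), then wait. */
	rcu_barrier();
	/* Only now may the callback code and its data go away. */
}
```

In this model a single rcu_barrier() covers every queued callback, which
mirrors the patch's point that the _bh and _sched variants now map to it.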



[PATCH RFC 3/5] doc: rcu: Remove obsolete suggestion from checklist

2018-10-05 Thread Joel Fernandes (Google)
call_rcu_bh is now implemented in terms of call_rcu, so the suggestion
to use a different API for speed benefits is not accurate anymore.
Update the document accordingly.

Signed-off-by: Joel Fernandes (Google) 
---
 Documentation/RCU/checklist.txt | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index 8860ab2a897a..cc22ce49618d 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -285,11 +285,7 @@ over a rather long period of time, but improvements are always welcome!
here is that superuser already has lots of ways to crash
the machine.
 
-   d.  Use call_rcu_bh() rather than call_rcu(), in order to take
-   advantage of call_rcu_bh()'s faster grace periods.  (This
-   is only a partial solution, though.)
-
-   e.  Periodically invoke synchronize_rcu(), permitting a limited
+   d.  Periodically invoke synchronize_rcu(), permitting a limited
number of updates per grace period.
 
The same cautions apply to call_rcu_bh(), call_rcu_sched(),
-- 
2.19.0.605.g01d371f741-goog



[PATCH RFC 1/5] doc: rcu: Update core and full API in whatisRCU

2018-10-05 Thread Joel Fernandes (Google)
The RCU consolidation effort makes the update side of the RCU API
consistent across all three RCU flavors (normal, sched, bh). Update the
full API in the whatisRCU document accordingly, so that we encourage
folks to use the consolidated API rather than the ones for the individual
flavors (which, if I understand correctly, could be removed in the
future).

Also, rcu_dereference is documented to be the same for all three mechanisms
(even before the consolidation); however, it is actually different, since
using the right rcu_dereference primitive (such as rcu_dereference_bh
for bh) is needed to make lock debugging work correctly. This update
also corrects that.

Also, add local_bh_disable() and local_bh_enable() as softirq
protection primitives and correct a grammar error in a quiz answer.

Signed-off-by: Joel Fernandes (Google) 
---
 Documentation/RCU/whatisRCU.txt | 55 +
 1 file changed, 28 insertions(+), 27 deletions(-)

diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 86d82f7f3500..2e8d1c0a824f 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -322,28 +322,27 @@ to their callers and (2) call_rcu() callbacks may be invoked.  Efficient
 implementations of the RCU infrastructure make heavy use of batching in
 order to amortize their overhead over many uses of the corresponding APIs.
 
-There are no fewer than three RCU mechanisms in the Linux kernel; the
-diagram above shows the first one, which is by far the most commonly used.
-The rcu_dereference() and rcu_assign_pointer() primitives are used for
-all three mechanisms, but different defer and protect primitives are
-used as follows:
+There are at least three flavors of RCU usage in the Linux kernel. The diagram
+above shows the most common one. On the updater side, the rcu_assign_pointer(),
+synchronize_rcu() and call_rcu() primitives used are the same for all three
+flavors. However for protection (on the reader side), the primitives used vary
+depending on the flavor:
 
-       Defer                   Protect
+a.     rcu_read_lock() / rcu_read_unlock()
+       rcu_dereference()
 
-a.     synchronize_rcu()       rcu_read_lock() / rcu_read_unlock()
-       call_rcu()              rcu_dereference()
+b.     rcu_read_lock_bh() / rcu_read_unlock_bh()
+       local_bh_disable() / local_bh_enable()
+       rcu_dereference_bh()
 
-b.     synchronize_rcu_bh()    rcu_read_lock_bh() / rcu_read_unlock_bh()
-       call_rcu_bh()           rcu_dereference_bh()
+c.     rcu_read_lock_sched() / rcu_read_unlock_sched()
+       preempt_disable() / preempt_enable()
+       local_irq_save() / local_irq_restore()
+       hardirq enter / hardirq exit
+       NMI enter / NMI exit
+       rcu_dereference_sched()
 
-c.     synchronize_sched()     rcu_read_lock_sched() / rcu_read_unlock_sched()
-       call_rcu_sched()        preempt_disable() / preempt_enable()
-                               local_irq_save() / local_irq_restore()
-                               hardirq enter / hardirq exit
-                               NMI enter / NMI exit
-                               rcu_dereference_sched()
-
-These three mechanisms are used as follows:
+These three flavors are used as follows:
 
 a. RCU applied to normal data structures.
 
@@ -867,18 +866,20 @@ RCU:    Critical sections       Grace period            Barrier
 
 bh:     Critical sections       Grace period            Barrier
 
-        rcu_read_lock_bh        call_rcu_bh             rcu_barrier_bh
-        rcu_read_unlock_bh      synchronize_rcu_bh
-        rcu_dereference_bh      synchronize_rcu_bh_expedited
+        rcu_read_lock_bh        call_rcu                rcu_barrier
+        rcu_read_unlock_bh      synchronize_rcu
+        [local_bh_disable]      synchronize_rcu_expedited
+        [and friends]
+        rcu_dereference_bh
         rcu_dereference_bh_check
         rcu_dereference_bh_protected
         rcu_read_lock_bh_held
 
 sched:  Critical sections       Grace period            Barrier
 
-        rcu_read_lock_sched     synchronize_sched       rcu_barrier_sched
-        rcu_read_unlock_sched   call_rcu_sched
-        [preempt_disable]       synchronize_sched_expedited
+        rcu_read_lock_sched     call_rcu                rcu_barrier
+        rcu_read_unlock_sched   synchronize_rcu
+        [preempt_disable]       synchronize_rcu_expedited
         [and friends]
         rcu_read_lock_sched_notrace
         rcu_read_unlock_sched_notrace
@@ -890,8 +891,8 @@ sched:  Critical sections       Grace period            Barrier
 
 SRCU:   Critical sections       Grace period            Barrier
 
-        srcu_read_lock          synchronize_srcu        srcu_barrier
-        srcu_read_unlock        call_srcu
+        srcu_read_lock          call_srcu               srcu_barrier
+        srcu_read_unlock        synchronize_srcu
         srcu_dereference        synchronize_srcu_expedited
         srcu_dereference_check
         srcu_read_lock_held

[PATCH RFC 2/5] doc: rcu: Add more rationale for using rcu_read_lock_sched in checklist

2018-10-05 Thread Joel Fernandes (Google)
Clarify why rcu_read_lock_sched() is preferable to a bare
preempt_disable(): the latter prevents lockdep from detecting locking issues.

Signed-off-by: Joel Fernandes (Google) 
---
 Documentation/RCU/checklist.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index 49747717d905..8860ab2a897a 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -63,7 +63,7 @@ over a rather long period of time, but improvements are always welcome!
pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(),
rcu_read_lock_sched(), or by the appropriate update-side lock.
Disabling of preemption can serve as rcu_read_lock_sched(), but
-   is less readable.
+   is less readable and prevents lockdep from detecting locking issues.
 
Letting RCU-protected pointers "leak" out of an RCU read-side
critical section is every bid as bad as letting them leak out
-- 
2.19.0.605.g01d371f741-goog



[PATCH RFC 0/5] rcu doc updates for whatisRCU and checklist

2018-10-05 Thread Joel Fernandes (Google)
Hi Paul,

Here are this week's rcu doc updates based on combing through whatisRCU and
checklists. Hopefully you agree with them. I left several old _bh and _sched
API references as is, since I don't think it's a good idea to remove them till
the APIs themselves are removed; however, I did remove several of them as well
(like in the first patch in this series) since I feel it's better to "encourage"
new users not to use the old API.

Also do you think it makes sense for us to write coccinelle patches to check if
folks use them on new patches? Btw, I am new to coccinelle but I'd love to give
it a try, it looks exciting. I remember you saying you wanted to do that, so
that's something else you could potentially offload to me as you see fit ;-)

Thank you very much!

Joel Fernandes (Google) (5):
  doc: rcu: Update core and full API in whatisRCU
  doc: rcu: Add more rationale for using rcu_read_lock_sched in
checklist
  doc: rcu: Remove obsolete suggestion from checklist
  doc: rcu: Remove obsolete checklist item about synchronize_rcu usage
  doc: rcu: Encourage use of rcu_barrier in checklist

 Documentation/RCU/checklist.txt | 49 +++--
 Documentation/RCU/whatisRCU.txt | 55 +
 2 files changed, 39 insertions(+), 65 deletions(-)

-- 
2.19.0.605.g01d371f741-goog


Re: [PATCH v9 1/2] dt-bindings: hwmon: Add ina3221 documentation

2018-10-05 Thread Rob Herring
On Mon,  1 Oct 2018 18:05:22 -0700, Nicolin Chen wrote:
> Texas Instruments INA3221 is a triple-channel shunt and bus
> voltage monitor. This patch adds a DT binding doc for it.
> 
> Signed-off-by: Nicolin Chen 
> ---
> Changelog
> v7->v9:
>  * N/A
> v6->v7:
>  * Restored three channel examples and merged them with the parent one
> v5->v6:
>  * Removed status property as no need to explicitly list it.
>  * Combined all examples into a complete one.
> v4->v5:
>  * Replaced "input-id" with "reg" and added address-cells and size-cells
>  * Replaced "input-label" with "label"
>  * Replaced "shunt-resistor" with "shunt-resistor-micro-ohms"
> v3->v4:
>  * Removed the attempt of putting labels in the node names
>  * Added a new optional label property in the child node
>  * Updated examples accordingly
> v2->v3:
>  * Added a simple subject in the line 1
>  * Fixed the shunt resistor value in the example
> v1->v2:
>  * Dropped channel name properties
>  * Added child node definitions.
>  * * Added shunt resistor property in the child node
>  * * Added status property to indicate connection status
>  * * Changed to use child node name as the label of input source
> 
>  .../devicetree/bindings/hwmon/ina3221.txt | 44 +++
>  1 file changed, 44 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/hwmon/ina3221.txt
> 

Reviewed-by: Rob Herring 


Re: [RFC PATCH v4 3/9] x86/cet/ibt: Add IBT legacy code bitmap allocation function

2018-10-05 Thread Eugene Syromiatnikov
On Fri, Oct 05, 2018 at 10:07:46AM -0700, Andy Lutomirski wrote:
> On Fri, Oct 5, 2018 at 10:03 AM Yu-cheng Yu  wrote:
> >
> > On Fri, 2018-10-05 at 09:28 -0700, Andy Lutomirski wrote:
> > > > On Oct 5, 2018, at 9:13 AM, Yu-cheng Yu  wrote:
> > > >
> > > > > On Wed, 2018-10-03 at 21:57 +0200, Eugene Syromiatnikov wrote:
> > > > > > On Fri, Sep 21, 2018 at 08:05:47AM -0700, Yu-cheng Yu wrote:
> > > > > > Indirect branch tracking provides an optional legacy code bitmap
> > > > > > that indicates locations of non-IBT compatible code.  When set,
> > > > > > each bit in the bitmap represents a page in the linear address is
> > > > > > legacy code.
> > > > > >
> > > > > > We allocate the bitmap only when the application requests it.
> > > > > > Most applications do not need the bitmap.
> > > > > >
> > > > > > Signed-off-by: Yu-cheng Yu 
> > > > > > ---
> > > > > > arch/x86/kernel/cet.c | 45 +++
> > > > > > 1 file changed, 45 insertions(+)
> > > > > >
> > > > > > diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
> > > > > > index 6adfe795d692..a65d9745af08 100644
> > > > > > --- a/arch/x86/kernel/cet.c
> > > > > > +++ b/arch/x86/kernel/cet.c
> > > > > > @@ -314,3 +314,48 @@ void cet_disable_ibt(void)
> > > > > >wrmsrl(MSR_IA32_U_CET, r);
> > > > > >current->thread.cet.ibt_enabled = 0;
> > > > > > }
> > > > > > +
> > > > > > +int cet_setup_ibt_bitmap(void)
> > > > > > +{
> > > > > > +u64 r;
> > > > > > +unsigned long bitmap;
> > > > > > +unsigned long size;
> > > > > > +
> > > > > > +if (!cpu_feature_enabled(X86_FEATURE_IBT))
> > > > > > +return -EOPNOTSUPP;
> > > > > > +
> > > > > > +if (!current->thread.cet.ibt_bitmap_addr) {
> > > > > > +/*
> > > > > > + * Calculate size and put in thread header.
> > > > > > + * may_expand_vm() needs this information.
> > > > > > + */
> > > > > > +size = TASK_SIZE / PAGE_SIZE / BITS_PER_BYTE;
> > > > >
> > > > > TASK_SIZE_MAX is likely needed here, as an application can easily switch
> > > > > between long and 32-bit protected mode.  And then the case of a CPU that
> > > > > doesn't support 5LPT.
> > > >
> > > > If we had calculated bitmap size from TASK_SIZE_MAX, all 32-bit apps would
> > > > have failed the allocation for bitmap size > TASK_SIZE.  Please see values
> > > > below, which is printed from the current code.
> > > >
> > > > Yu-cheng
> > > >
> > > >
> > > > x64:
> > > > TASK_SIZE_MAX=  7fff  f000
> > > > TASK_SIZE=  7fff  f000
> > > > bitmap size=    
> > > >
> > > > x32:
> > > > TASK_SIZE_MAX=  7fff  f000
> > > > TASK_SIZE=    e000
> > > > bitmap size=   0001 
> > > >
> > >
> > > I haven’t followed all the details here, but I have a general policy of
> > > objecting to any new use of TASK_SIZE. If you really really need to depend
> > > on 32-bitness in new code, please figure out what exactly you mean by
> > > “32-bit” and use an explicit check.
> >
> > The explicit check would be:
> >
> > test_thread_flag(TIF_ADDR32) ? IA32_PAGE_OFFSET : TASK_SIZE_MAX
> >
> > which is the same as TASK_SIZE.
> 
> But this is only ever done in response to a syscall, right?  So
> wouldn't in_compat_syscall() be the right check?
> 
> Also, this whole thing makes me extremely nervous.  The MSR only
> contains the start address, not the size, right?  So what prevents
> some goof from causing the CPU to read way past the end of the bitmap
> if the bitmap is short because the kernel thought it was supposed to
> be 32-bit?

That's what I've mentioned initially: every syscall made with int 0x80
is interpreted as compat, even if it was made from long mode.

> I'm inclined to suggest something awful-ish: always allocate the
> bitmap as though it's for a 64-bit process, and just let it be at a
> high address.  And add a syscall or arch_prctl() to manipulate it for
> the benefit of 32-bit programs that can't address it directly.

That's likely the only way to go.


Re: [RFC PATCH v4 3/9] x86/cet/ibt: Add IBT legacy code bitmap allocation function

2018-10-05 Thread Andy Lutomirski
On Fri, Oct 5, 2018 at 10:03 AM Yu-cheng Yu  wrote:
>
> On Fri, 2018-10-05 at 09:28 -0700, Andy Lutomirski wrote:
> > > On Oct 5, 2018, at 9:13 AM, Yu-cheng Yu  wrote:
> > >
> > > > On Wed, 2018-10-03 at 21:57 +0200, Eugene Syromiatnikov wrote:
> > > > > On Fri, Sep 21, 2018 at 08:05:47AM -0700, Yu-cheng Yu wrote:
> > > > > Indirect branch tracking provides an optional legacy code bitmap
> > > > > that indicates locations of non-IBT compatible code.  When set,
> > > > > each bit in the bitmap represents a page in the linear address is
> > > > > legacy code.
> > > > >
> > > > > We allocate the bitmap only when the application requests it.
> > > > > Most applications do not need the bitmap.
> > > > >
> > > > > Signed-off-by: Yu-cheng Yu 
> > > > > ---
> > > > > arch/x86/kernel/cet.c | 45 +++
> > > > > 1 file changed, 45 insertions(+)
> > > > >
> > > > > diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
> > > > > index 6adfe795d692..a65d9745af08 100644
> > > > > --- a/arch/x86/kernel/cet.c
> > > > > +++ b/arch/x86/kernel/cet.c
> > > > > @@ -314,3 +314,48 @@ void cet_disable_ibt(void)
> > > > >wrmsrl(MSR_IA32_U_CET, r);
> > > > >current->thread.cet.ibt_enabled = 0;
> > > > > }
> > > > > +
> > > > > +int cet_setup_ibt_bitmap(void)
> > > > > +{
> > > > > +u64 r;
> > > > > +unsigned long bitmap;
> > > > > +unsigned long size;
> > > > > +
> > > > > +if (!cpu_feature_enabled(X86_FEATURE_IBT))
> > > > > +return -EOPNOTSUPP;
> > > > > +
> > > > > +if (!current->thread.cet.ibt_bitmap_addr) {
> > > > > +/*
> > > > > + * Calculate size and put in thread header.
> > > > > + * may_expand_vm() needs this information.
> > > > > + */
> > > > > +size = TASK_SIZE / PAGE_SIZE / BITS_PER_BYTE;
> > > >
> > > > TASK_SIZE_MAX is likely needed here, as an application can easily switch
> > > > between long and 32-bit protected mode.  And then the case of a CPU that
> > > > doesn't support 5LPT.
> > >
> > > If we had calculated bitmap size from TASK_SIZE_MAX, all 32-bit apps would
> > > have failed the allocation for bitmap size > TASK_SIZE.  Please see values
> > > below, which is printed from the current code.
> > >
> > > Yu-cheng
> > >
> > >
> > > x64:
> > > TASK_SIZE_MAX=  7fff  f000
> > > TASK_SIZE=  7fff  f000
> > > bitmap size=    
> > >
> > > x32:
> > > TASK_SIZE_MAX=  7fff  f000
> > > TASK_SIZE=    e000
> > > bitmap size=   0001 
> > >
> >
> > I haven’t followed all the details here, but I have a general policy of
> > objecting to any new use of TASK_SIZE. If you really really need to depend
> > on 32-bitness in new code, please figure out what exactly you mean by “32-bit”
> > and use an explicit check.
>
> The explicit check would be:
>
> test_thread_flag(TIF_ADDR32) ? IA32_PAGE_OFFSET : TASK_SIZE_MAX
>
> which is the same as TASK_SIZE.

But this is only ever done in response to a syscall, right?  So
wouldn't in_compat_syscall() be the right check?

Also, this whole thing makes me extremely nervous.  The MSR only
contains the start address, not the size, right?  So what prevents
some goof from causing the CPU to read way past the end of the bitmap
if the bitmap is short because the kernel thought it was supposed to
be 32-bit?

I'm inclined to suggest something awful-ish: always allocate the
bitmap as though it's for a 64-bit process, and just let it be at a
high address.  And add a syscall or arch_prctl() to manipulate it for
the benefit of 32-bit programs that can't address it directly.

>
> Or, do we want a new macro?
>
> #define IBT_BITMAP_SIZE (test_thread_flag(TIF_ADDR32) ? \
> (IA32_PAGE_OFFSET / PAGE_SIZE / BITS_PER_BYTE) : \
> (TASK_SIZE_MAX / PAGE_SIZE / BITS_PER_BYTE))

No.  I don't like hiding magic like this in a macro that looks like a constant.


Re: [RFC PATCH v4 3/9] x86/cet/ibt: Add IBT legacy code bitmap allocation function

2018-10-05 Thread Yu-cheng Yu
On Fri, 2018-10-05 at 09:28 -0700, Andy Lutomirski wrote:
> > On Oct 5, 2018, at 9:13 AM, Yu-cheng Yu  wrote:
> > 
> > > On Wed, 2018-10-03 at 21:57 +0200, Eugene Syromiatnikov wrote:
> > > > On Fri, Sep 21, 2018 at 08:05:47AM -0700, Yu-cheng Yu wrote:
> > > > Indirect branch tracking provides an optional legacy code bitmap
> > > > that indicates locations of non-IBT compatible code.  When set,
> > > > each bit in the bitmap represents a page in the linear address is
> > > > legacy code.
> > > > 
> > > > We allocate the bitmap only when the application requests it.
> > > > Most applications do not need the bitmap.
> > > > 
> > > > Signed-off-by: Yu-cheng Yu 
> > > > ---
> > > > arch/x86/kernel/cet.c | 45 +++
> > > > 1 file changed, 45 insertions(+)
> > > > 
> > > > diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
> > > > index 6adfe795d692..a65d9745af08 100644
> > > > --- a/arch/x86/kernel/cet.c
> > > > +++ b/arch/x86/kernel/cet.c
> > > > @@ -314,3 +314,48 @@ void cet_disable_ibt(void)
> > > >wrmsrl(MSR_IA32_U_CET, r);
> > > >current->thread.cet.ibt_enabled = 0;
> > > > }
> > > > +
> > > > +int cet_setup_ibt_bitmap(void)
> > > > +{
> > > > +u64 r;
> > > > +unsigned long bitmap;
> > > > +unsigned long size;
> > > > +
> > > > +if (!cpu_feature_enabled(X86_FEATURE_IBT))
> > > > +return -EOPNOTSUPP;
> > > > +
> > > > +if (!current->thread.cet.ibt_bitmap_addr) {
> > > > +/*
> > > > + * Calculate size and put in thread header.
> > > > + * may_expand_vm() needs this information.
> > > > + */
> > > > +size = TASK_SIZE / PAGE_SIZE / BITS_PER_BYTE;
> > > 
> > > TASK_SIZE_MAX is likely needed here, as an application can easily switch
> > > between long and 32-bit protected mode.  And then the case of a CPU that
> > > doesn't support 5LPT.
> > 
> > If we had calculated bitmap size from TASK_SIZE_MAX, all 32-bit apps would
> > have failed the allocation for bitmap size > TASK_SIZE.  Please see values
> > below, which is printed from the current code.
> > 
> > Yu-cheng
> > 
> > 
> > x64:
> > TASK_SIZE_MAX=  7fff  f000
> > TASK_SIZE=  7fff  f000
> > bitmap size=    
> > 
> > x32:
> > TASK_SIZE_MAX=  7fff  f000
> > TASK_SIZE=    e000
> > bitmap size=   0001 
> > 
> 
> I haven’t followed all the details here, but I have a general policy of
> objecting to any new use of TASK_SIZE. If you really really need to depend on
> 32-bitness in new code, please figure out what exactly you mean by “32-bit”
> and use an explicit check.

The explicit check would be:

test_thread_flag(TIF_ADDR32) ? IA32_PAGE_OFFSET : TASK_SIZE_MAX

which is the same as TASK_SIZE.

Or, do we want a new macro?

#define IBT_BITMAP_SIZE (test_thread_flag(TIF_ADDR32) ? \
(IA32_PAGE_OFFSET / PAGE_SIZE / BITS_PER_BYTE) : \
(TASK_SIZE_MAX / PAGE_SIZE / BITS_PER_BYTE))

Yu-cheng


[PATCH 07/36] kbuild: Add support for DT binding schema checks

2018-10-05 Thread Rob Herring
This adds the build infrastructure for checking DT binding schema
documents and validating dts files using the binding schema.

Check DT binding schema documents:
make dt_binding_check

Build dts files and check using DT binding schema:
make dtbs_check

Currently, the validation targets are separate from a normal build to
avoid a hard dependency on the external DT schema project and because
there are lots of warnings generated.

Cc: Jonathan Corbet 
Cc: Mark Rutland 
Cc: Masahiro Yamada 
Cc: Michal Marek 
Cc: linux-doc@vger.kernel.org
Cc: devicet...@vger.kernel.org
Cc: linux-kbu...@vger.kernel.org
Signed-off-by: Rob Herring 
---
 .gitignore   |  1 +
 Documentation/Makefile   |  2 +-
 Documentation/devicetree/bindings/.gitignore |  2 ++
 Documentation/devicetree/bindings/Makefile   | 30 
 Makefile |  8 +-
 scripts/Makefile.lib | 24 ++--
 6 files changed, 63 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/.gitignore
 create mode 100644 Documentation/devicetree/bindings/Makefile

diff --git a/.gitignore b/.gitignore
index 97ba6b79834c..a20ac26aa2f5 100644
--- a/.gitignore
+++ b/.gitignore
@@ -15,6 +15,7 @@
 *.bin
 *.bz2
 *.c.[012]*.*
+*.dt.yaml
 *.dtb
 *.dtb.S
 *.dwo
diff --git a/Documentation/Makefile b/Documentation/Makefile
index 2ca77ad0f238..9786957c6a35 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -2,7 +2,7 @@
 # Makefile for Sphinx documentation
 #
 
-subdir-y :=
+subdir-y := devicetree/bindings/
 
 # You can set these variables from the command line.
 SPHINXBUILD   = sphinx-build
diff --git a/Documentation/devicetree/bindings/.gitignore b/Documentation/devicetree/bindings/.gitignore
new file mode 100644
index ..7e47316f1d7a
--- /dev/null
+++ b/Documentation/devicetree/bindings/.gitignore
@@ -0,0 +1,2 @@
+*.example.dts
+*.yaml.tmp
diff --git a/Documentation/devicetree/bindings/Makefile b/Documentation/devicetree/bindings/Makefile
new file mode 100644
index ..b57f0dec3fab
--- /dev/null
+++ b/Documentation/devicetree/bindings/Makefile
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: GPL-2.0
+DT_DOC_CHECKER ?= dt-doc-validate
+DT_EXTRACT_EX ?= dt-extract-example
+
+quiet_cmd_chk_binding = CHKDT   $<
+  cmd_chk_binding = (set -e; \
+ $(DT_DOC_CHECKER) $< ; \
+ mkdir -p $(dir $@) ; \
+ $(DT_EXTRACT_EX) $< > $@ )
+
+$(obj)/%.example.dts: $(src)/%.yaml FORCE
+   $(call if_changed,chk_binding)
+
+DT_MK_SCHEMA ?= dt-mk-schema
+DT_TMP_SCHEMA := .schema.yaml.tmp
+extra-y += $(DT_TMP_SCHEMA)
+
+quiet_cmd_mk_schema = SCHEMA  $@
+  cmd_mk_schema = mkdir -p $(obj); \
+  $(DT_MK_SCHEMA) -o $@ $(srctree)/$(src)
+
+DT_DOCS = $(shell cd $(srctree)/$(src) && find * -name '*.yaml')
+DTS_EXAMPLES = $(patsubst %.yaml,%.example.dts, $(DT_DOCS))
+extra-y += $(DTS_EXAMPLES)
+
+DTBS = $(patsubst %.yaml,%.example.dtb, $(DT_DOCS))
+extra-y += $(DTBS)
+
+$(obj)/$(DT_TMP_SCHEMA): $(addprefix $(obj)/, $(DTBS)) FORCE
+   $(call if_changed,mk_schema)
diff --git a/Makefile b/Makefile
index 021e274c4b03..648f7238e883 100644
--- a/Makefile
+++ b/Makefile
@@ -1227,10 +1227,13 @@ ifneq ($(dtstree),)
 %.dtb: prepare3 scripts_dtc
$(Q)$(MAKE) $(build)=$(dtstree) $(dtstree)/$@
 
-PHONY += dtbs dtbs_install
+PHONY += dtbs dtbs_install dt_binding_check
 dtbs: prepare3 scripts_dtc
$(Q)$(MAKE) $(build)=$(dtstree)
 
+dtbs_check: prepare3 dt_binding_check
+   $(Q)$(MAKE) $(build)=$(dtstree) CHECK_DTBS=1
+
 dtbs_install:
$(Q)$(MAKE) $(dtbinst)=$(dtstree)
 
@@ -1244,6 +1247,9 @@ PHONY += scripts_dtc
 scripts_dtc: scripts_basic
$(Q)$(MAKE) $(build)=scripts/dtc
 
+dt_binding_check: scripts_dtc
+   $(Q)$(MAKE) $(build)=Documentation/devicetree/bindings
+
 # ---
 # Modules
 
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 8fe4468f9bda..d1c5630ba24c 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -61,6 +61,11 @@ real-obj-m := $(foreach m, $(obj-m), $(if $(strip $($(m:.o=-objs)) $($(m:.o=-y))
 extra-y+= $(dtb-y)
 extra-$(CONFIG_OF_ALL_DTBS)+= $(dtb-)
 
+ifneq ($(CHECK_DTBS),)
+extra-y += $(patsubst %.dtb,%.dt.yaml, $(dtb-y))
+extra-$(CONFIG_OF_ALL_DTBS) += $(patsubst %.dtb,%.dt.yaml, $(dtb-))
+endif
+
 # Add subdir path
 
 extra-y:= $(addprefix $(obj)/,$(extra-y))
@@ -284,13 +289,28 @@ $(obj)/%.dtb.S: $(obj)/%.dtb FORCE
 quiet_cmd_dtc = DTC $@
 cmd_dtc = mkdir -p $(dir ${dtc-tmp}) ; \
$(HOSTCC) -E $(dtc_cpp_flags) -x assembler-with-cpp -o $(dtc-tmp) $< ; \
-   $(DTC) -O dtb -o $@ -b 0 \
+   $(DTC) -O $(2) -o $@ -b 0 \
$(addprefix -i,$(dir $<) $(DTC_INCLUDE)) $(DTC_FLAGS) \
-d $(depfil

Re: [PATCH security-next v4 23/32] selinux: Remove boot parameter

2018-10-05 Thread Kees Cook
On Thu, Oct 4, 2018 at 9:58 PM, James Morris  wrote:
> On Thu, 4 Oct 2018, Kees Cook wrote:
>
>> On Thu, Oct 4, 2018 at 10:49 AM, James Morris  wrote:
>> > On Wed, 3 Oct 2018, Kees Cook wrote:
>> >> Then someone boots the system with:
>> >>
>> >> selinux=1 security=selinux
>> >>
>> >> In what order does selinux get initialized relative to yama?
>> >> (apparmor, flagged as a "legacy major", would have been disabled by
>> >> the "security=" not matching it.)
>> >
>> > It doesn't, it needs to be specified in one place.
>> >
>> > Distros will need to update boot parameter handling for this kernel
>> > onwards.  Otherwise, we will need to carry this confusing mess forward
>> > forever.
>>
>> Are you saying that you want to overrule Paul and Stephen about
>> keeping "selinux=1 security=selinux" working?
>
> Not overrule, but convince.
>
> At least, deprecate selinux=1 and security=X, but not extend it any
> further.

Okay, this is the expectation from me as well. I think my series makes
it work as-is with the new stuff just fine.

>> > In my most recent suggestion, there is no '!' disablement, just
>> > enablement.  If an LSM is not listed in CONFIG_LSM="", it's not enabled.
>>
>> And a user would need to specify ALL lsms on the "lsm=" line?
>>
>
> Yes, the ones they want enabled.
>
>> What do you think of my latest proposal? It could happily work all
>> three ways: old boot params and security= work ("selinux=1
>> security=selinux" keeps working), individual LSM enable/disable works
>> ("lsm=+loadpin"), and full LSM ordering works
>> ("lsm=each,lsm,in,order,here"):
>>
>> https://lore.kernel.org/lkml/cagxu5jjjit8bdnvgxafkuvfpy7nwtjw2orwfbg-6iwk0+a1...@mail.gmail.com/
>>
>
> I think having something like +yama will still lead to confusion.
> Explicitly stating each enabled LSM in order is totally unambiguous.
>
> If people are moving away from the distro defaults, and there is no
> high-level interface to manage this, it seems to me there's a deeper
> issue with the distro.

Okay. I will adjust the series and send a v5.

-Kees

-- 
Kees Cook
Pixel Security


Re: [PATCH security-next v4 23/32] selinux: Remove boot parameter

2018-10-05 Thread James Morris
On Fri, 5 Oct 2018, James Morris wrote:

> On Thu, 4 Oct 2018, Kees Cook wrote:

> > And a user would need to specify ALL lsms on the "lsm=" line?
> > 
> 
> Yes, the ones they want enabled.

If they're overriding the kconfig value.

-- 
James Morris
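
[Editorial illustration, not part of the original message.] The "explicit list" semantics discussed above (only the LSMs named on the line are enabled, in the order given) can be sketched in plain userspace C. This is a minimal sketch under stated assumptions: the function names, the table of known LSMs, and the handling of unknown names are all hypothetical and are not taken from the kernel or from the patch series.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical set of built-in LSMs, for illustration only. */
static const char *const known_lsms[] = {
	"yama", "loadpin", "selinux", "apparmor"
};

/* Return the canonical table entry matching name[0..len), or NULL. */
static const char *find_known(const char *name, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(known_lsms) / sizeof(known_lsms[0]); i++) {
		if (strlen(known_lsms[i]) == len &&
		    strncmp(known_lsms[i], name, len) == 0)
			return known_lsms[i];
	}
	return NULL;
}

/*
 * Parse a comma-separated list such as "yama,selinux" and store the
 * enabled LSMs in out[] in the order given.  Unknown or empty names
 * are skipped; duplicates are not filtered in this sketch.  Returns
 * the number of entries stored (at most max).
 */
size_t parse_lsm_list(const char *list, const char *out[], size_t max)
{
	size_t n = 0;

	while (*list && n < max) {
		size_t len = strcspn(list, ",");
		const char *lsm = find_known(list, len);

		if (lsm)
			out[n++] = lsm;
		list += len;
		if (*list == ',')
			list++;
	}
	return n;
}
```

With this shape there is no "+name" or "!name" syntax to interpret: anything not named in the list is simply absent from out[], which is the unambiguity argued for in the thread.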




Re: [RFC PATCH v4 3/9] x86/cet/ibt: Add IBT legacy code bitmap allocation function

2018-10-05 Thread Andy Lutomirski



> On Oct 5, 2018, at 9:13 AM, Yu-cheng Yu  wrote:
> 
>> On Wed, 2018-10-03 at 21:57 +0200, Eugene Syromiatnikov wrote:
>>> On Fri, Sep 21, 2018 at 08:05:47AM -0700, Yu-cheng Yu wrote:
>>> Indirect branch tracking provides an optional legacy code bitmap
>>> that indicates locations of non-IBT compatible code.  When set,
>>> each bit in the bitmap represents a page in the linear address is
>>> legacy code.
>>> 
>>> We allocate the bitmap only when the application requests it.
>>> Most applications do not need the bitmap.
>>> 
>>> Signed-off-by: Yu-cheng Yu 
>>> ---
>>> arch/x86/kernel/cet.c | 45 +++
>>> 1 file changed, 45 insertions(+)
>>> 
>>> diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
>>> index 6adfe795d692..a65d9745af08 100644
>>> --- a/arch/x86/kernel/cet.c
>>> +++ b/arch/x86/kernel/cet.c
>>> @@ -314,3 +314,48 @@ void cet_disable_ibt(void)
>>>wrmsrl(MSR_IA32_U_CET, r);
>>>current->thread.cet.ibt_enabled = 0;
>>> }
>>> +
>>> +int cet_setup_ibt_bitmap(void)
>>> +{
>>> +u64 r;
>>> +unsigned long bitmap;
>>> +unsigned long size;
>>> +
>>> +if (!cpu_feature_enabled(X86_FEATURE_IBT))
>>> +return -EOPNOTSUPP;
>>> +
>>> +if (!current->thread.cet.ibt_bitmap_addr) {
>>> +/*
>>> + * Calculate size and put in thread header.
>>> + * may_expand_vm() needs this information.
>>> + */
>>> +size = TASK_SIZE / PAGE_SIZE / BITS_PER_BYTE;
>> 
>> TASK_SIZE_MAX is likely needed here, as an application can easily switch
>> between long and 32-bit protected mode.  And then the case of a CPU that
>> doesn't support 5LPT.
> 
> If we had calculated bitmap size from TASK_SIZE_MAX, all 32-bit apps would
> have failed the allocation for bitmap size > TASK_SIZE.  Please see values
> below, which is printed from the current code.
> 
> Yu-cheng
> 
> 
> x64:
> TASK_SIZE_MAX=  7fff  f000
> TASK_SIZE=  7fff  f000
> bitmap size=    
> 
> x32:
> TASK_SIZE_MAX=  7fff  f000
> TASK_SIZE=    e000
> bitmap size=   0001 
> 

I haven’t followed all the details here, but I have a general policy of 
objecting to any new use of TASK_SIZE. If you really really need to depend on 
32-bitness in new code, please figure out what exactly you mean by “32-bit” and 
use an explicit check.

Some day I would love to delete TASK_SIZE.

Re: [RFC PATCH v4 3/9] x86/cet/ibt: Add IBT legacy code bitmap allocation function

2018-10-05 Thread Yu-cheng Yu
On Wed, 2018-10-03 at 21:57 +0200, Eugene Syromiatnikov wrote:
> On Fri, Sep 21, 2018 at 08:05:47AM -0700, Yu-cheng Yu wrote:
> > Indirect branch tracking provides an optional legacy code bitmap
> > that indicates locations of non-IBT compatible code.  When set,
> > each bit in the bitmap represents a page in the linear address is
> > legacy code.
> > 
> > We allocate the bitmap only when the application requests it.
> > Most applications do not need the bitmap.
> > 
> > Signed-off-by: Yu-cheng Yu 
> > ---
> >  arch/x86/kernel/cet.c | 45 +++
> >  1 file changed, 45 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
> > index 6adfe795d692..a65d9745af08 100644
> > --- a/arch/x86/kernel/cet.c
> > +++ b/arch/x86/kernel/cet.c
> > @@ -314,3 +314,48 @@ void cet_disable_ibt(void)
> > wrmsrl(MSR_IA32_U_CET, r);
> > current->thread.cet.ibt_enabled = 0;
> >  }
> > +
> > +int cet_setup_ibt_bitmap(void)
> > +{
> > +   u64 r;
> > +   unsigned long bitmap;
> > +   unsigned long size;
> > +
> > +   if (!cpu_feature_enabled(X86_FEATURE_IBT))
> > +   return -EOPNOTSUPP;
> > +
> > +   if (!current->thread.cet.ibt_bitmap_addr) {
> > +   /*
> > +* Calculate size and put in thread header.
> > +* may_expand_vm() needs this information.
> > +*/
> > +   size = TASK_SIZE / PAGE_SIZE / BITS_PER_BYTE;
> 
> TASK_SIZE_MAX is likely needed here, as an application can easily switch
> between long and 32-bit protected mode.  And then the case of a CPU that
> doesn't support 5LPT.

If we had calculated bitmap size from TASK_SIZE_MAX, all 32-bit apps would have
failed the allocation for bitmap size > TASK_SIZE.  Please see values below,
which is printed from the current code.

Yu-cheng


x64:
TASK_SIZE_MAX   =  7fff  f000
TASK_SIZE   =  7fff  f000
bitmap size =    

x32:
TASK_SIZE_MAX   =  7fff  f000
TASK_SIZE   =    e000
bitmap size =   0001 
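
[Editorial illustration, not part of the original message.] The size computation under discussion, one bit per page of address space, can be reproduced in userspace as a rough sketch. The 4 KiB page size and the helper name are assumptions made for this example; the kernel takes PAGE_SIZE and TASK_SIZE from the architecture headers, and the values printed above are partly elided in this archive, so they are not restated here.

```c
#include <stdint.h>

/* Assumed constants for illustration; the kernel defines its own. */
#define SKETCH_PAGE_SIZE     4096ULL	/* assumes 4 KiB pages */
#define SKETCH_BITS_PER_BYTE 8ULL

/*
 * Bytes needed for a bitmap holding one bit per page of the
 * address range [0, task_size).  Mirrors the expression
 * TASK_SIZE / PAGE_SIZE / BITS_PER_BYTE from the patch.
 */
uint64_t ibt_bitmap_bytes(uint64_t task_size)
{
	return task_size / SKETCH_PAGE_SIZE / SKETCH_BITS_PER_BYTE;
}
```

For a hypothetical 47-bit user address space (task_size 0x00007ffffffff000) this gives 0xffffffff bytes, just under 4 GiB, which is larger than an entire 32-bit address space. That is consistent with the point above that a bitmap sized from TASK_SIZE_MAX could not be allocated by a 32-bit application.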



Re: [PATCH 0/2] ext4: even more documentation fixes

2018-10-05 Thread Randy Dunlap
On 10/5/18 8:22 AM, Theodore Y. Ts'o wrote:
> On Thu, Oct 04, 2018 at 07:48:31PM -0700, Randy Dunlap wrote:
>> Hi Darrick,
>>
>> I don't see patch 2/2 anywhere (my inbox, email archives)...
> 
> Probably because it's moving a lot of files around, so the diffs were 276k.

Oh, yeah.  Thanks.

-- 
~Randy


Re: [PATCH 0/2] ext4: even more documentation fixes

2018-10-05 Thread Theodore Y. Ts'o
On Thu, Oct 04, 2018 at 07:48:31PM -0700, Randy Dunlap wrote:
> Hi Darrick,
> 
> I don't see patch 2/2 anywhere (my inbox, email archives)...

Probably because it's moving a lot of files around, so the diffs were 276k.



> 
> -- 
> ~Randy


Re: [PATCH v4 0/3] fs/dcache: Track # of negative dentries

2018-10-05 Thread Waiman Long
On 09/12/2018 01:35 PM, Waiman Long wrote:
>  v3->v4:
>   - Drop patch 4 as it is just a minor optimization.
>   - Add a cc:stable tag to patch 1.
>   - Clean up some comments in patch 3.
>
>  v2->v3:
>   - With confirmation that the dummy array in dentry_stat structure
> was never a replacement of a previously used field, patch 3 is now
> reverted back to use one of dummy field as the negative dentry count
> instead of adding a new field.
>
>  v1->v2:
>   - Clarify what the new nr_dentry_negative per-cpu counter is tracking
> and open-code the increment and decrement as suggested by Dave Chinner.
>   - Append the new nr_dentry_negative count as the 7th element of dentry-state
> instead of replacing one of the dummy entries.
>   - Remove patch "fs/dcache: Make negative dentries easier to be
> reclaimed" for now as I need more time to think about what
> to do with it.
>   - Add 2 more patches to address issues found while reviewing the
> dentry code.
>   - Add another patch to change the conditional branch of
> nr_dentry_negative accounting to conditional move so as to reduce
> the performance impact of the accounting code.
>
> This patchset addresses 2 issues found in the dentry code and adds a
> new nr_dentry_negative per-cpu counter to track the total number of
> negative dentries in all the LRU lists.
>
> Patch 1 fixes a bug in the accounting of nr_dentry_unused in
> shrink_dcache_sb().
>
> Patch 2 removes the cacheline_aligned_in_smp tag from super_block
> LRU lists.
>
> Patch 3 adds the new nr_dentry_negative per-cpu counter.
>
> Various filesystem related tests were run and no statistically
> significant changes in performance outside of the possible noise range
> was observed.
>
> Waiman Long (3):
>   fs/dcache: Fix incorrect nr_dentry_unused accounting in
> shrink_dcache_sb()
>   fs: Don't need to put list_lru into its own cacheline
>   fs/dcache: Track & report number of negative dentries
>
>  Documentation/sysctl/fs.txt | 26 --
>  fs/dcache.c | 38 +-
>  include/linux/dcache.h  |  7 ---
>  include/linux/fs.h  |  9 +
>  4 files changed, 58 insertions(+), 22 deletions(-)
>
Any comments on these patches? The first one is actually a bug fix.

Cheers,
Longman



[PATCH] Documentation/arm64: HugeTLB page implementation

2018-10-05 Thread Punit Agrawal
The Arm v8 architecture supports multiple page sizes - 4k, 16k and
64k. Based on the active page size, the Linux port supports
corresponding hugepage sizes at the PMD and PUD (4k only) levels.

In addition, the architecture also supports caching larger sized
ranges (composed of multiple entries) at the PTE and PMD level in the
TLBs using the contiguous bit. The Linux port makes use of this
architectural support to enable additional hugepage sizes.

Describe the two different types of hugepages supported by the arm64
kernel and the hugepage sizes enabled by each.

Signed-off-by: Punit Agrawal 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Jonathan Corbet 
---
 Documentation/arm64/hugetlbpage.txt | 39 +
 1 file changed, 39 insertions(+)
 create mode 100644 Documentation/arm64/hugetlbpage.txt

diff --git a/Documentation/arm64/hugetlbpage.txt b/Documentation/arm64/hugetlbpage.txt
new file mode 100644
index 000000000000..64ee24b88d27
--- /dev/null
+++ b/Documentation/arm64/hugetlbpage.txt
@@ -0,0 +1,39 @@
+HugeTLBpage on ARM64
+====================
+
+Hugepages rely on making efficient use of TLBs to improve the performance of
+address translations. The benefit depends on both:
+
+  - the size of hugepages
+  - the size of entries supported by the TLBs
+
+The ARM64 port supports two flavours of hugepages.
+
+1) Block mappings at the pud/pmd level
+--------------------------------------
+
+These are regular hugepages where a pmd or a pud page table entry points to a
+block of memory. Regardless of the supported size of entries in the TLB, block
+mappings reduce the depth of the page table walk needed to translate hugepage
+addresses.
+
+2) Using the Contiguous bit
+---------------------------
+
+The architecture provides a contiguous bit in the translation table entries
+(D4.5.3, ARM DDI 0487C.a) that hints to the MMU that the entry is one of a
+contiguous set of entries that can be cached in a single TLB entry.
+
+The contiguous bit is used in Linux to increase the mapping size at the pmd and
+pte (last) level. The number of supported contiguous entries varies by page
+size and level of the page table.
+
+
+
+The following hugepage sizes are supported -
+
+          CONT PTE    PMD    CONT PMD    PUD
+          --------    ---    --------    ---
+  4K:         64K     2M         32M     1G
+  16K:         2M    32M          1G
+  64K:         2M   512M         16G
-- 
2.18.0
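
The sizes in the table above can be derived from the page size alone, as the
sketch below tries to show. It is an illustration, not code from the patch or
the kernel. All function names are hypothetical. The 8-byte translation table
entry and the contiguous-entry counts (16/128/32 ptes and 16/32/32 pmds for
4K/16K/64K pages) are assumptions chosen to be consistent with the table and
should be checked against the arm64 headers.

```c
#include <stdint.h>

/* Assumption: 8-byte entries, so one table page holds page_size / 8 entries. */
static uint64_t entries_per_table(uint64_t page_size)
{
	return page_size / 8;
}

/* A pmd block mapping covers one full last-level table of pages. */
static uint64_t pmd_block_size(uint64_t page_size)
{
	return page_size * entries_per_table(page_size);
}

/* A pud block mapping covers one full table of pmd blocks. */
static uint64_t pud_block_size(uint64_t page_size)
{
	return pmd_block_size(page_size) * entries_per_table(page_size);
}

/* Assumed number of ptes sharing one contiguous range; 0 if unknown. */
static unsigned int cont_ptes(uint64_t page_size)
{
	switch (page_size) {
	case 4096:  return 16;
	case 16384: return 128;
	case 65536: return 32;
	default:    return 0;
	}
}

/* Assumed number of pmds sharing one contiguous range; 0 if unknown. */
static unsigned int cont_pmds(uint64_t page_size)
{
	switch (page_size) {
	case 4096:  return 16;
	case 16384: return 32;
	case 65536: return 32;
	default:    return 0;
	}
}

static uint64_t cont_pte_size(uint64_t page_size)
{
	return page_size * cont_ptes(page_size);
}

static uint64_t cont_pmd_size(uint64_t page_size)
{
	return pmd_block_size(page_size) * cont_pmds(page_size);
}
```

For 4K pages this gives 64K, 2M, 32M and 1G for the four columns. The PUD
column is only filled in for 4K pages because, as the commit message says,
PUD-level hugepages are supported with 4k pages only.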



[GIT PULL linux-next] Add Compiler Attributes (v6) tree

2018-10-05 Thread Miguel Ojeda
Hi Stephen,

As discussed, here is the Compiler Attributes series for
linux-next. This time it is the original v6, based on -rc6.

The changes w.r.t. v5:

  - Added latest Reviewed-by's and Tested-by's.

The conflicts are trivial to solve, but if you want a reference, take
a look at (rebased on top of next-20181005):

  https://github.com/ojeda/linux.git compiler-attributes-rebased

Thanks!

Cheers,
Miguel

The following changes since commit 17b57b1883c1285f3d0dc2266e8f79286a7bef38:

  Linux 4.19-rc6 (2018-09-30 07:15:35 -0700)

are available in the Git repository at:

  https://github.com/ojeda/linux.git compiler-attributes

for you to fetch changes up to f0604f63033d4020f019d2aaee805c1075b1077b:

  Compiler Attributes: ext4: remove local __nonstring definition
(2018-09-30 20:14:04 +0200)


Miguel Ojeda (15):
  Compiler Attributes: remove unused attributes
  Compiler Attributes: always use the extra-underscores syntax
  Compiler Attributes: remove unneeded tests
  Compiler Attributes: homogenize __must_be_array
  Compiler Attributes: remove unneeded sparse (__CHECKER__) tests
  Compiler Attributes: add missing SPDX ID in compiler_types.h
  Compiler Attributes: use feature checks instead of version checks
  Compiler Attributes: KENTRY used twice the "used" attribute
  Compiler Attributes: remove uses of __attribute__ from compiler.h
  Compiler Attributes: add Doc/process/programming-language.rst
  Compiler Attributes: add MAINTAINERS entry
  Compiler Attributes: add support for __nonstring (gcc >= 8)
  Compiler Attributes: enable -Wstringop-truncation on W=1 (gcc >= 8)
  Compiler Attributes: auxdisplay: panel: use __nonstring
  Compiler Attributes: ext4: remove local __nonstring definition

 Documentation/process/index.rst|   1 +
 Documentation/process/programming-language.rst |  45 +
 MAINTAINERS|   5 +
 drivers/auxdisplay/panel.c |   7 +-
 fs/ext4/ext4.h |   9 -
 include/linux/compiler-clang.h |   5 -
 include/linux/compiler-gcc.h   |  70 +--
 include/linux/compiler-intel.h |   9 -
 include/linux/compiler.h   |  19 +-
 include/linux/compiler_attributes.h| 258 +
 include/linux/compiler_types.h | 101 ++
 scripts/Makefile.extrawarn |   1 +
 12 files changed, 341 insertions(+), 189 deletions(-)
 create mode 100644 Documentation/process/programming-language.rst
 create mode 100644 include/linux/compiler_attributes.h


Re: [PATCH v3 2/6] mm/memory_hotplug: make add_memory() take the device_hotplug_lock

2018-10-05 Thread Oscar Salvador
On Thu, Sep 27, 2018 at 11:25:50AM +0200, David Hildenbrand wrote:
> Reviewed-by: Pavel Tatashin 
> Reviewed-by: Rafael J. Wysocki 
> Reviewed-by: Rashmica Gupta 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Oscar Salvador 

-- 
Oscar Salvador
SUSE L3


Re: [PATCH v3 1/6] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock

2018-10-05 Thread Oscar Salvador
On Thu, Sep 27, 2018 at 11:25:49AM +0200, David Hildenbrand wrote:
> Reviewed-by: Pavel Tatashin 
> Reviewed-by: Rafael J. Wysocki 
> Reviewed-by: Rashmica Gupta 
> Signed-off-by: David Hildenbrand 
 
Reviewed-by: Oscar Salvador 

-- 
Oscar Salvador
SUSE L3


Re: [PATCH v3 6/6] memory-hotplug.txt: Add some details about locking internals

2018-10-05 Thread Oscar Salvador
On Thu, Sep 27, 2018 at 11:25:54AM +0200, David Hildenbrand wrote:
> Cc: Jonathan Corbet 
> Cc: Michal Hocko 
> Cc: Andrew Morton 
> Reviewed-by: Pavel Tatashin 
> Reviewed-by: Rashmica Gupta 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Oscar Salvador 

-- 
Oscar Salvador
SUSE L3


Re: [PATCH v3 3/6] mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock

2018-10-05 Thread Oscar Salvador
On Thu, Sep 27, 2018 at 11:25:51AM +0200, David Hildenbrand wrote:
> Reviewed-by: Pavel Tatashin 
> Reviewed-by: Rashmica Gupta 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Oscar Salvador 
-- 
Oscar Salvador
SUSE L3