from:"Josh Triplett"

Re: [PATCH 00/13] [RFC] Rust support

2021-04-16 Thread Josh Triplett

On Fri, Apr 16, 2021 at 12:27:39PM +0800, Boqun Feng wrote:
> Josh, I think it's good if we can connect to the people working on Rust
> memoryg model, I think the right person is Ralf Jung and the right place
> is https://github.com/rust-lang/unsafe-code-guidelines, but you
> cerntainly know better than me ;-) Or maybe we can use Rust-for-Linux or
> linux-toolchains list to discuss.

Ralf is definitely the right person to talk to. I don't think the UCG
repository is the right place to start that discussion, though. For now,
I'd suggest starting an email thread with Ralf and some C-and-kernel
memory model folks (hi Paul!) to suss out the most important changes
that would be needed.

With my language team hat on, I'd *absolutely* like to see the Rust
memory model support RCU-style deferred reclamation in a sound way,
ideally with as little unsafe code as possible.

Re: [PATCH 00/13] [RFC] Rust support

2021-04-14 Thread Josh Triplett

On Wed, Apr 14, 2021 at 01:21:52PM -0700, Linus Torvalds wrote:
> On Wed, Apr 14, 2021 at 1:10 PM Matthew Wilcox  wrote:
> >
> > There's a philosophical point to be discussed here which you're skating
> > right over!  Should rust-in-the-linux-kernel provide the same memory
> > allocation APIs as the rust-standard-library, or should it provide a Rusty
> > API to the standard-linux-memory-allocation APIs?
> 
> Yeah, I think that the standard Rust API may simply not be acceptable
> inside the kernel, if it has similar behavior to the (completely
> broken) C++ "new" operator.
> 
> So anything that does "panic!" in the normal Rust API model needs to
> be (statically) caught, and never exposed as an actual call to
> "panic()/BUG()" in the kernel.

Rust has both kinds of allocation APIs: you can call a method like
`Box::new` that panics on allocation failure, or a method like
`Box::try_new` that returns an error on allocation failure.

With some additional infrastructure that's still in progress, we could
just not supply the former kind of methods at all, and *only* supply the
latter, so that you're forced to handle allocation failure. That just
requires introducing some further ability to customize the Rust standard
library.

(There are some cases of methods in the standard library that don't have
a `try_` equivalent, but we could fix that. Right now, for instance,
there isn't a `try_` equivalent of every Vec method, and you're instead
expected to call `try_reserve` to make sure you have enough memory
first; however, that could potentially be changed.)

Re: A note on the 5.12-rc1 tag

2021-03-05 Thread Josh Triplett

On Fri, Mar 05, 2021 at 10:10:05AM -0800, Junio C Hamano wrote:
> Christian Couder  writes:
> 
> >> (git notes would be nice for this, but they're hard to share reliably;
> >> the above mechanism to accumulate entries from a file in the repo seems
> >> simpler. I can imagine other possibilities.)
> >
> > If the notes are created automatically from the `/.git-bisect-skip`
> > files and stored in `refs/notes/skip`, then they might not need to be
> > shared. If people already share notes, they would just need to stop
> > sharing those in `refs/notes/skip`.
> 
> Ehh, doesn't Josh _want_ to share them, though?  I do not know if a
> single "refs/notes/bisect-skip" notes would do, or you'd need one
> notes tree per the kind of bisection (iow, people may be chasing
> different set of bugs, and the commits that need to be skipped while
> chasing one bug may be OK to test while chasing another one), but I
> would imagine that for this particular use case of marking "these
> commits are dangerous to check out and build on", it does not depend
> on what you are bisecting to find at all, so sharing would be a
> sensible thing to do.
> 
> It is trivial for you to fetch the refs/notes/do-not--checkout notes
> tree from me and merge it into your refs/notes/do-not--checkout
> notes tree, I would think; "git notes merge" may have room for
> improvement, but essentially it would just want a union of two
> sets, no?

My primary concern about notes is that they require manual
action/configuration in order to share. I was looking for a solution
that would automatically protect anyone who pulled from linux.git
(directly or indirectly), without them having to specifically take a
separate step to sync this information.

Re: A note on the 5.12-rc1 tag

2021-03-04 Thread Josh Triplett

[CCing the git list]

On Wed, Mar 03, 2021 at 12:53:18PM -0800, Linus Torvalds wrote:
> Hey peeps - some of you may have already noticed that in my public git
> tree, the "v5.12-rc1" tag has magically been renamed to
> "v5.12-rc1-dontuse". It's still the same object, it still says
> "v5.12-rc1" internally, and it is still is signed by me, but the
> user-visible name of the tag has changed.
> 
> The reason is fairly straightforward: this merge window, we had a very
> innocuous code cleanup and simplification that raised no red flags at
> all, but had a subtle and very nasty bug in it: swap files stopped
> working right.  And they stopped working in a particularly bad way:
> the offset of the start of the swap file was lost.
> 
> Swapping still happened, but it happened to the wrong part of the
> filesystem, with the obvious catastrophic end results.
[...]
> One additional reason for this note is that I want to not just warn
> people to not run this if you have a swapfile - even if you are
> personally not impacted (like I am, and probably most people are -
> swap partitions all around) - I want to make sure that nobody starts
> new topic branches using that 5.12-rc1 tag. I know a few developers
> tend to go "Ok, rc1 is out, I got all my development work into this
> merge window, I will now fast-forward to rc1 and use that as a base
> for the next release". Don't do it this time. It may work perfectly
> well for you because you have the common partition setup, but it can
> end up being a horrible base for anybody else that might end up
> bisecting into that area.

Even if people avoid basing their topic branches on 5.12-rc1, it's still
possible for a future bisect to end up wandering to one of the existing
dangerous commits, if someone's trying to find a historical bug and git
happens to choose that as a halfway point. And if they happen to be
using a swap file, they could end up with serious data loss, years from
now when "5.12-rc1 is broken" isn't on the top of their mind or even
something they heard about originally.

Would it make sense to add a feature to git that allows defining a
"dangerous" region for bisect? Rough sketch:
- Add a `/.git-bisect-dangerous` file to the repository, containing a
  list of of commit range expressions (contains commit X, doesn't
  contain commit Y) and associated messages ("Do not use these kernels
  if you have a swap file; if you need to bisect into here, disable swap
  files first").
- git-bisect, as it navigates commits, always checks that file for any
  commit it processes, and adds any new entries it sees into
  `.git/bisect-dangerous`; it never removes entries from there.
- git-bisect avoids choosing bisection points anywhere in that range
  until it absolutely has to (because it's narrowed an issue to that
  range). This can use something similar to the existing `git bisect
  skip` machinery. Manual bisections print the message at that point.
  Automated bisections (`git bisect run`) stop and print the range
  without narrowing further, unless the user passes something like
  `--dangerous-ok=commit-range`.

(git notes would be nice for this, but they're hard to share reliably;
the above mechanism to accumulate entries from a file in the repo seems
simpler. I can imagine other possibilities.)

Does something like this seem potentially reasonable, and worth doing to
help people avoid future catastrophic data loss?


- Josh Triplett

[PATCH] nbd: Respect max_part for all partition scans

2020-12-17 Thread Josh Triplett

The creation path of the NBD device respects max_part and only scans for
partitions if max_part is not 0. However, some other code paths ignore
max_part, and unconditionally scan for partitions. Add a check for
max_part on each partition scan.

Signed-off-by: Josh Triplett 
---

Caught this when reading recent NBD patches, which tweaked the
nbd_set_size path.

It doesn't seem worth factoring these two lines into a function,
especially since the callers obtain the disk structure in different
ways.

 drivers/block/nbd.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 92f84ed0ba9e..6727358e147d 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -318,7 +318,8 @@ static int nbd_set_size(struct nbd_device *nbd, loff_t 
bytesize,
blk_queue_logical_block_size(nbd->disk->queue, blksize);
blk_queue_physical_block_size(nbd->disk->queue, blksize);
 
-   set_bit(GD_NEED_PART_SCAN, >disk->state);
+   if (max_part)
+   set_bit(GD_NEED_PART_SCAN, >disk->state);
if (!set_capacity_and_notify(nbd->disk, bytesize >> 9))
kobject_uevent(_to_dev(nbd)->kobj, KOBJ_CHANGE);
return 0;
@@ -1476,9 +1477,11 @@ static int nbd_open(struct block_device *bdev, fmode_t 
mode)
refcount_set(>config_refs, 1);
refcount_inc(>refs);
mutex_unlock(>config_lock);
-   set_bit(GD_NEED_PART_SCAN, >bd_disk->state);
+   if (max_part)
+   set_bit(GD_NEED_PART_SCAN, >bd_disk->state);
} else if (nbd_disconnected(nbd->config)) {
-   set_bit(GD_NEED_PART_SCAN, >bd_disk->state);
+   if (max_part)
+   set_bit(GD_NEED_PART_SCAN, >bd_disk->state);
}
 out:
mutex_unlock(_index_mutex);
-- 
2.29.2

Re: LPC 2020 Hackroom Session: summary and next steps for isolated user namespaces

2020-10-11 Thread Josh Triplett

On Fri, Oct 09, 2020 at 11:26:06PM -0500, Serge E. Hallyn wrote:
> > 3. Find a way to allow setgroups() in a user namespace while keeping
> >in mind the case of groups used for negative access control.
> >This was suggested by Josh Triplett and Geoffrey Thomas. Their idea was 
> > to
> >investigate adding a prctl() to allow setgroups() to be called in a user
> >namespace at the cost of restricting paths to the most restrictive
> >permission. So if something is 0707 it needs to be treated as if it's 
> > 
> >even though the caller is not in its owning group which is used for 
> > negative
> >access control (how these new semantics will interact with ACLs will also
> >need to be looked into).
> 
> I should probably think this through more, but for this problem, would it
> not suffice to add a new prevgroups grouplist to the struct cred, maybe
> struct group_info *locked_groups, and every time an unprivileged task creates
> a new user namespace, add all its current groups to this list?

So, effectively, you would be allowed to drop permissions, but
locked_groups would still be checked for restrictions?

That seems like it'd introduce a new level of complexity (a new facet of
permission) to manage. Not opposed, but it does seem more complex than
just opting out of using groups for negative permissions.

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-09 Thread Josh Triplett

On Fri, Oct 09, 2020 at 10:37:32AM -0400, Theodore Y. Ts'o wrote:
> That being said, on the ext4 weekly video chat, we did discuss other
> uses of an incompat feature flag that would allow the allocation
> bitmap blocks and inode table block fields in the superblock to be
> zero, which would mean that they are unallocated.

What do you mean by "allocation bitmap blocks and inode table block
fields in the superblock"? Those are in the group descriptor, not the
superblock. Or am I missing something there?

> This would allow us
> to dynamically grow the inode table by adding an extra block group
> descriptor.  In fact, I'd probably use this as an opportunity to make
> some other changes, such using inodes to store locations of the block
> group descriptors, inode tables, and allocation bitmaps at the same
> time.  Those details can be discussed later, but the point is that
> this is why it's good to discuss format changes from a requirements
> perspective, so that if we do need to make an incompat change, we can
> kill multiple birds with a single stone.

I would be quite interested in that.

> On Thu, Oct 08, 2020 at 03:22:59PM -0700, Josh Triplett wrote:
> > It's an arbitrary filesystem hierarchy, including directories, files of
> > various sizes (hence using inline_data), and permissions. The problem
> > isn't to get data from point A to point B; the problem is (in part) to
> > turn a representation of a filesystem into an actual mounted filesystem
> > as efficiently as possible, live-serving individual blocks on demand
> > rather than generating the whole image in advance.
> 
> Ah, so you want to be able to let the other side "look at" the file
> system in parallel with it being generated on demand?  The cache
> coherency problems would seem to be... huge.  For example, how can you
> add a file to directory after the reader has looked at the directory
> inode and directory blocks?

I don't. While the data is computed on demand for performance reasons,
the nature and size of all the data is fully known in advance. I never
add data to a filesystem, only create new filesystem images. The kernel
*is* looking at the filesystem in parallel with it being generated,
insofar as blocks aren't constructed until the kernel asks for them
(which is a massive performance win, especially since the kernel may
only want a small subset of the filesystem). But the block contents are
fixed in advance, even if they haven't been generated yet. So the kernel
can read ahead and cache any blocks it wants to, and they'll be valid.
(Excessive readahead might be a performance problem, but not a
correctness one.)

I briefly considered ideas around adding new data after the filesystem
was mounted, and dismissed those ideas just as quickly, for exactly
these reasons (disk caching, filesystem caching, readahead). That would
require a filesystem with at least some subset of cluster features. I
don't plan to go there if I don't have to.

If I do end up needing that, I'd consider proposing an ext4 change along
the lines of making the root directory into a "super-root" directory
under which multiple filesystem roots could live, and supporting a way
to re-read that root to discover new filesystem roots and new block
groups; then, it'd be possible to add a new root whose contents are
*mostly* references to existing inodes (that the kernel would already
have cached), and any modified directories or new/modified files would
have new inodes added. That would isolate the "don't read ahead and
cache" problem to the new inodes, which could potentially be isolated to
new block groups for simplicity, and the "discover new roots" mechanism
could also discover the newly added block groups and inode tables.

But again, I'm not curently looking to do that, and *if* I were, I'd
bring it up for architectural discussion and design on linux-ext4 first.
There's also a balance there between a simple version that'd work for an
append-only read-only filesystem, and a complex version that'd work for
a writable filesystem, and I'd hope more for the former than the latter,
but that'd be a topic for discussion.

- Josh Triplett

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-09 Thread Josh Triplett

On Thu, Oct 08, 2020 at 07:54:09PM -0700, Darrick J. Wong wrote:
> On Thu, Oct 08, 2020 at 03:38:58PM -0700, Josh Triplett wrote:
> > On Thu, Oct 08, 2020 at 10:54:48AM -0700, Darrick J. Wong wrote:
> > > > > the head", and continued with the notion that anything other than
> > > > > e2fsprogs making something to be mounted by mount(2) and handled by
> > > > > fs/ext4 is being "inflicted", and if the goal didn't still seem to be
> > > > > "how do we make it go away so that only e2fsprogs and the kernel ever
> > > > > touch ext4". I started this thread because I'd written some userspace
> > > > > code, a new version of the kernel made that userspace code stop 
> > > > > working,
> > > > > so I wanted to report that the moment I'd discovered that, along with 
> > > > > a
> > > > > potential way to address it with as little disrupton to ext4 as
> > > > > possible.
> > > 
> > > Ted: I don't think it's a good idea to make SHARED_BLOCKS disable block
> > > validity checking by default.  You could someday enable users to write
> > > to block-sharing files by teaching ext4 how to COW, at which point you'd
> > > need correct bitmaps and block validity to prevent accidental overwrite
> > > of critical metadata.  At that point you'd either be stuck with the
> > > precedent that SHARED_BLOCKS implies noblock_validity, or you'd end up
> > > breaking Josh's images again.
> > 
> > That's a fair point (though I think a writable COW version of
> > SHARED_BLOCKS would need a different flag). It'd make more sense to key
> > this logic off of RO_COMPAT_READONLY or similar. But even better:
> 
> It wouldn't require a new flag -- "rocompat" features bits mean that
> "it's safe to allow userspace to read files off the disk if software
> doesn't recognize this feature bit".

Sure; I was more thinking that if that involved adding some data
structures to track shared blocks and the need to COW, whatever
mechanism that used might potentially need an incompat flag.

> If someone /did/ add the code to write to files safely in the presence
> of shared blocks, all they'd have to do to light up the functionality is
> add it to the _SUPP define.  Also, it's strange that the kernel source
> doesn't even know of SHARED_BLOCKS, but that's also on Ted...

It would be nice if the kernel's ext4.h header and e2fsprogs were fully
in sync, yes.

- Josh Triplett

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-08 Thread Josh Triplett

ccidental overwrite
> of critical metadata.  At that point you'd either be stuck with the
> precedent that SHARED_BLOCKS implies noblock_validity, or you'd end up
> breaking Josh's images again.

That's a fair point (though I think a writable COW version of
SHARED_BLOCKS would need a different flag). It'd make more sense to key
this logic off of RO_COMPAT_READONLY or similar. But even better:

> "noblock_validity" in the superblock mount options field of the images
> you create.

Yeah, I can do that. That's a much better solution, thank you. It would
have been problematic to have to change the userspace that mounts the
filesystem to pass new mount options ("noblock_validity") for the new
kernel. But if I can embed it in the filesystem, that'll work.

I'll do that, and please feel free to drop the original proposed patch
as it's no longer needed. Thanks, Darrick.

- Josh Triplett

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-08 Thread Josh Triplett

On Thu, Oct 08, 2020 at 01:25:57PM -0600, Andreas Dilger wrote:
> On Oct 8, 2020, at 1:12 PM, Josh Triplett  wrote:
> > On Wed, Oct 07, 2020 at 08:57:12PM -0600, Andreas Dilger wrote:
> >> I *do* think that inline_data is an under-appreciated feature that I
> >> would be happy to see some improvements with.  I don't think that small
> >> files are a niche use case, and if we can clean up the inline_data code
> >> to work with 128-byte inodes I'm not against that, even though I'm not
> >> going to use that combination of features myself.
> > 
> > I'd love to see that happen. At the time, it seemed like too large of a
> > change to block on, which is why I ended up deciding to switch to
> > 256-byte inodes.
> 
> Does that mean you are using inline_data with 256-byte inodes?

I am, yes, and it mostly works great. I've hit zero issues with it in
the filesystems I'm generating.

> That would also be good to know, since there haven't been any
> well-known users of this feature so far (AFAIK).  Since you are using
> this in a read-only manner, you won't hit the one know issue when an
> inline_data inode is extended to use an external block that may
> temporarily leave the inode in an inconsistent state.

I've run into a few other issues with it in other tools, as well. mke2fs
with inline_data generates invalid files given xattrs:
https://lore.kernel.org/linux-ext4/20200926102512.GA11386@localhost/T/#u

And extlinux doesn't like inline_data at all:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=971002

I'll report any other issues I run into using inline_data. I agree that
it's deeply underappreciated.

- Josh

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-08 Thread Josh Triplett

On Wed, Oct 07, 2020 at 10:10:17PM -0400, Theodore Y. Ts'o wrote:
> On Wed, Oct 07, 2020 at 01:14:24PM -0700, Josh Triplett wrote:
> > That sounds like a conversation that would have been a lot more
> > interesting and enjoyable if it hadn't started with "can we shoot it in
> > the head", and continued with the notion that anything other than
> > e2fsprogs making something to be mounted by mount(2) and handled by
> > fs/ext4 is being "inflicted", and if the goal didn't still seem to be
> > "how do we make it go away so that only e2fsprogs and the kernel ever
> > touch ext4". I started this thread because I'd written some userspace
> > code, a new version of the kernel made that userspace code stop working,
> > so I wanted to report that the moment I'd discovered that, along with a
> > potential way to address it with as little disrupton to ext4 as
> > possible.
> 
> What is really getting my dander up is your attempt to claim that the
> on-disk file system format is like the userspace/kernel interface,
> where if we break any file system that file system that was
> "previously accepted by an older kernel", this is a bug that must be
> reverted or otherwise fixed to allow file systems that had previously
> worked, to continue to work.  And this is true even if the file system
> is ***invalid***.
> 
> And the problem with this is that there have been any number of
> commits where file systems which were previously invalid, but which
> could be caused to trigger a syzbot whine, which was fixed by
> tightening up the validity tests in the kernel.  In some cases, I had
> to also had to fix up e2fsck to detect the invalid file system which
> was generated by the file system fuzzer.  Yes, it's unfortunate that
> we didn't have these checks earlier, but a file system has a huge
> amount of state.
> 
> The principle you've articulated would make it impossible for me to
> fix these bugs, unless I can prove that the failure to check a
> particular invalid file system corruption could lead to a security
> vulnerability.  (Would it be OK for me to make the kernel more strict
> and reject an invalid file system if it triggers a WARN_ON, so I get
> the syzbot complaint, but it doesn't actually cause a security issue?)
> 
> So this conversation would have been a lot more pleasant for *me* if
> you hadn't tried to elevate your request to a general principle, where
> if someone is deliberately generating an invalid file system, I'm not
> allowed to make the kernel more strict to detect said invalidity and
> to reject the invalid / corrupted / fuzzed file system.

I appreciate you providing this explanation; this makes the nature and
severity of your concern *much* more clear. I also now understand why I
was feeling confused, and why it felt several times like we were talking
past each other; see below. For my part, I had no desire to have
generated this level of concern. The restrictions you're describing
above are far, far past anything I had possibly envisioned or desired. I
want to clarify a couple of things.

I wasn't trying to make a *new* general principle or policy. I was under
the impression that this *was* the policy, because it never occurred to
me that it could be otherwise. It seemed like a natural aspect of the
kernel/userspace boundary, to the point that the idea of it *not* being
part of the kernel's stability guarantees didn't cross my mind. Thus,
when you were suggesting a policy of "only filesystems generated by the
e2fsprogs userspace tools are acceptable", that seems like you were
suggesting a massive loosening of existing policy. And when I suggested
an alternative of "only filesystems that pass e2fsck are acceptable", I
was also trying to suggest a loosening of the existing policy, just a
different, milder one. I would have come to this discussion with a very
different perspective if it had occurred to me that this might not be
the policy, or at least that it simply hadn't come up in this way
before.

I would be quite happy to have a discussion about where that stability
boundary should be. I also have no desire to be inflexible about that
boundary or about the evolution of the software I'm working on; this is
not some enterprise "thou shalt not touch" workload. On the contrary,
within reason it's *relatively* straightforward for me to evolve
filesystem generation code to deal with changes in the kernel or e2fsck.
I was not looking to push some new general principle on stability; I
only wished to ensure that it was reasonable to negotiate about the best
way to solve problems if and when they arise, and I hope they do not
arise often.  (Most of the time, I'd expect that when they arise I'll
end up wanting to change my software rather than the kernel.)

I also don't think th

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-08 Thread Josh Triplett

On Wed, Oct 07, 2020 at 08:57:12PM -0600, Andreas Dilger wrote:
> On Oct 7, 2020, at 2:14 PM, Josh Triplett  wrote:
> > If those aren't the right way to express that, I could potentially
> > adapt. I had a similar such conversation on linux-ext4 already (about
> > inline data with 128-bit inodes), which led to me choosing to abandon
> > 128-byte inodes rather than try to get ext4 to support what I wanted
> > with them, because I didn't want to be disruptive to ext4 for a niche
> > use case. In the particular case that motivated this thread, what I was
> > doing already worked in previous kernels, and it seemed reasonable to
> > ask for it to continue to work in new kernels, while preserving the
> > newly added checks in the new kernels.
> 
> This was discussed in the "Inline data with 128-byte inodes?" thread
> back in May.  While Jan was not necessarily in favour of this, I was
> actually OK with improving the ext4 code to handle this case better,
> since it would (at minimum) clean up ext4 to make a clear separation
> of how it is detecting data in the i_block[] array and the system.data
> xattr, and I don't think it added any complexity to the code.
> 
> I even posted a WIP patch to that effect, but didn't get a response back:
> https://marc.info/?l=linux-ext4=158863275019187

My apologies, I thought I responded to that. It looks promising to me,
though I wouldn't have the bandwidth to take it to completion anytime
soon.

> I *do* think that inline_data is an under-appreciated feature that I
> would be happy to see some improvements with.  I don't think that small
> files are a niche use case, and if we can clean up the inline_data code
> to work with 128-byte inodes I'm not against that, even though I'm not
> going to use that combination of features myself.

I'd love to see that happen. At the time, it seemed like too large of a
change to block on, which is why I ended up deciding to switch to
256-byte inodes.

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-07 Thread Josh Triplett

On Wed, Oct 07, 2020 at 10:32:11AM -0400, Theodore Y. Ts'o wrote:
> On Wed, Oct 07, 2020 at 01:03:04AM -0700, Josh Triplett wrote:
> > > But can we *please* take your custom tool out back and shoot it in the
> > > head?
> > 
> > Nope. As mentioned, this isn't about creating ext4 filesystem images,
> > and it isn't even remotely similar to mke2fs.
> 
> Can you please tell us what your tool is for, then?  Why are you doing
> this?  Why are you inflicting this on us?

That sounds like a conversation that would have been a lot more
interesting and enjoyable if it hadn't started with "can we shoot it in
the head", and continued with the notion that anything other than
e2fsprogs making something to be mounted by mount(2) and handled by
fs/ext4 is being "inflicted", and if the goal didn't still seem to be
"how do we make it go away so that only e2fsprogs and the kernel ever
touch ext4". I started this thread because I'd written some userspace
code, a new version of the kernel made that userspace code stop working,
so I wanted to report that the moment I'd discovered that, along with a
potential way to address it with as little disruption to ext4 as
possible.

I'm not looking to be an alternative userspace, or an alternative
anything; that implies serving more-or-less the same function
differently. I have no interest in supplanting mke2fs or any other part
of e2fsprogs; I'm using those for many of the purposes they already
serve.

The short version is that I needed a library to rapidly turn
dynamically-obtained data into a set of disk blocks to be served
on-the-fly as a software-defined disk, and then mounted on the other
side of that interface by the Linux kernel. Turns out that's *many
orders of magnitude* faster than any kind of network filesystem like
NFS. It's slightly similar to a vvfat for ext4. The less blocks it can
generate and account for and cache, the faster it can run, and
microseconds matter.

ext4 has *incredible* compatibility guarantees. I'd already understood
the whole compat/ro_compat mechanism when I read through the on-disk
format documentation and the code. RO_COMPAT_SHARED_BLOCKS *seemed* like
the right semantic description of "don't ever try to write to this
filesystem because there are deduplicated blocks", and
RO_COMPAT_READONLY seemed like the right semantic description for "don't
ever try to write this, it's permanently read-only for unspecified
reasons".

If those aren't the right way to express that, I could potentially
adapt. I had a similar such conversation on linux-ext4 already (about
inline data with 128-bit inodes), which led to me choosing to abandon
128-byte inodes rather than try to get ext4 to support what I wanted
with them, because I didn't want to be disruptive to ext4 for a niche
use case. In the particular case that motivated this thread, what I was
doing already worked in previous kernels, and it seemed reasonable to
ask for it to continue to work in new kernels, while preserving the
newly added checks in the new kernels.

If the response here had been more along the lines of "could we create
and use a *different* compat flag for shared metadata and have
RO_COMPAT_SHARED_BLOCKS only mean shared data blocks", I'd be fine with
that.

If at some point I'm looking to make ext4 support more than it already
does (e.g. a way to omit bitmaps entirely, or a way to express
contiguous files with smaller extent maps, or other enhancements for
read-only filesystems), I'd be coming with architectural discussions
first, patches second, and at no point would I have the expectation that
ext4 folks need to do extra work on my behalf. I'm happy to do the work.
The *only* thing I'm asking, here, is "don't break things that worked".
And after this particular item, I'd be happy to narrow that to "don't
break things that e2fsck was previously happy with".

- Josh Triplett

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-07 Thread Josh Triplett

On Tue, Oct 06, 2020 at 09:35:33AM -0400, Theodore Y. Ts'o wrote:
> On Mon, Oct 05, 2020 at 10:03:06PM -0700, Josh Triplett wrote:
> > I'm not trying to create a problem here; I'm trying to address a whole
> > family of problems. I was generally under the impression that mounting
> > existing root filesystems fell under the scope of the kernel<->userspace
> > or kernel<->existing-system boundary, as defined by what the kernel
> > accepts and existing userspace has used successfully, and that upgrading
> > the kernel should work with existing userspace and systems. If there's
> > some other rule that applies for filesystems, I'm not aware of that.
> > (I'm also not trying to suggest that every random corner case of what
> > the kernel *could* accept needs to be the format definition, but rather,
> > cases that correspond to existing userspace.)
>
> I'm not opposed to the kernel side change; it's *this time*.

I appreciate that. (Does that mean you wouldn't object to this patch
going into 5.9, with Jens' Reviewed-by and your ack?)

> I'm more interested in killing off the tool that generated the
> malformed file system in the first place.

It was developed for a reason, and it's not intended for general-purpose
use. It serves its purpose, which has nothing to do with e2fsprogs or
any component of e2fsprogs. It does not simply create filesystem images,
in the way that mke2fs or e2fsdroid does.

If I'd found any existing code that would have worked, or could have
been adapted to work, I would have happily reused it; I don't like
reinventing the wheel.

Also, see below regarding e2fsck.

> As I keep pointing out, things aren't going to go well if "e2fsck -E
> unshare_blocks" is applied to it.

These aren't intended to *ever* become writable. I've ensured that
they're marked with EXT4_FEATURE_RO_COMPAT_READONLY as well, which I
would hope implies that. Would that be sufficient to convince e2fsck
that it should never attempt to apply unshare_blocks to it?

Also, it seems like most of block_validity is trying to carefully avoid
metadata blocks intersecting with the journal, which could cause all
manner of issues; the filesystems I'm working with have no journal
(because they're permanently read-only).

With the sole exception of the shared bitmap block, I have it as a
requirement to pass e2fsck with zero complaints. If you'd be willing to
take a patch to e2fsck along the same lines as this kernel patch, then
I'd be happy to add it to the regression test suite: no filesystem
images that e2fsck is unhappy with. Would that help?

As mentioned before, I'd also be happy to supply some representative
filesystem images.

> We had a similar issue with Android.  Many years ago, Andy Rubin was
> originally quite allergic to the GPL, and had tried to promulgate the
> rule, "no GPL in Android Userspace".  This is why bionic is used as
> libc, and this resulted in Android engineers (I think before the
> Google acquisition, but I'm not 100% sure), creating an unofficial,
> "unauthorized" make_ext4fs which was a BSD-licensed version of mke2fs.

That is indeed a sad reason to develop anything. It's also particularly
sad to develop a different tool to serve exactly the same purpose as an
existing tool.

As opposed to, in this case, developing a different piece of software
to serve different purposes, that happens to make use of ext4 as one
component.

> Unfortuantely, it created file systems which the kernel would never
> complain about, but which, 50% of the time, would result in a file
> system which under some circumstances, would get corrupted (even more)
> when e2fsck attempted to repair the file system.  So if a user had a
> bit flip caused by an eMMC hiccup, e2fsck could end up making things
> worse.

I'd be interested in reading more about that, if you could suggest a
link or the right search terms. I did find
https://bugzilla.kernel.org/show_bug.cgi?id=195561 , which references the
issues in the same way you've done here, but I didn't find any of the
specific details you're mentioning here about what made the images it
created fragile.

I did see you mention, there, the advice of making sure to check
filesystems with e2fsck. That's something I've done from day one: I
didn't stop until e2fsck was happy, not just the kernel.

> So that's why I really don't like it when there are "unauthorized",
> unofficial tools creating file systems out there which we are now
> obliged to support.

ext4 is an incredibly powerful and performant filesystem, with a
reasonably well-documented format and many useful properties. Not all
software that works with ext4 is going to live in e2fsprogs, nor should
it need to.

This would be analogous to the notion that (say) the FreeBSD kernel
belongs in the e2fsprogs repository because it has an ext4 driver. The
tail wo

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-06 Thread Josh Triplett

On Mon, Oct 05, 2020 at 10:03:13PM -0700, Josh Triplett wrote:
> On Mon, Oct 05, 2020 at 11:18:34PM -0400, Theodore Y. Ts'o wrote:
> > What Josh is proposing I'm pretty sure would also break "e2fsck -E
> > unshare_blocks", so that's another reason not to accept this as a
> > valid format change.
> 
> The kernel already accepted this as a valid mountable filesystem format,
> without a single error or warning of any kind, and has done so stably
> for years.
> 
> > As far as I'm concerned, contrib/e2fsdroid is the canonical definition
> > of how to create valid file systems with shared_blocks.
> 
> I'm not trying to create a problem here; I'm trying to address a whole
> family of problems. I was generally under the impression that mounting
> existing root filesystems fell under the scope of the kernel<->userspace
> or kernel<->existing-system boundary, as defined by what the kernel
> accepts and existing userspace has used successfully, and that upgrading
> the kernel should work with existing userspace and systems. If there's
> some other rule that applies for filesystems, I'm not aware of that.
> (I'm also not trying to suggest that every random corner case of what
> the kernel *could* accept needs to be the format definition, but rather,
> cases that correspond to existing userspace.)
> 
> It wouldn't be *impossible* to work around this, this time; it may be
> possible to adapt the existing userspace to work on the new and old
> kernels. My concern is, if a filesystem format accepted by previous
> kernels can be rejected by future kernels, what stops a future kernel
> from further changing the format definition or its strictness
> (co-evolving with one specific userspace) and causing further
> regressions?
> 
> I don't *want* to rely on what apparently turned out to be an
> undocumented bug in the kernel's validator. That's why I was trying to
> fix the issue in what seemed like the right way, by detecting the
> situation and turning off the validator. That seemed like it would fully
> address the issue. If it would help, I could also supply a tiny filesystem
> image for regression testing.
> 
> I'm trying to figure out what solution you'd like to see here, as long
> as it isn't "any userspace that isn't e2fsdroid can be broken at will".
> I'd be willing to work to adapt the userspace bits I have to work around
> the regression, but I'd like to get this on the radar so this doesn't
> happen again.

To clarify something further: I'm genuinely not looking to push hard on
the limits or corners of the kernel/userspace boundary here, nor do I
want to create an imposition on development. I'm happy to attempt to be
a little more flexible than most userspace. I'm trying to make
substantial, non-trivial use of the userspace side of a kernel/userspace
boundary, and within reason, I need to rely on the kernel's stability
guarantees. I'm relying on the combination of
Documentation/filesystems/ext4 and fs/ext4 as the format documentation.
The first time I discovered this issue was in doing some "there's about
to be a new kernel release" regression testing for 5.9, in which it
created a debugging adventure to track down what the problem was. I'd
like to find a good way to report and handle this kind of thing going
forward, if another issue like this arises.

- Josh Triplett

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-05 Thread Josh Triplett

On Mon, Oct 05, 2020 at 11:18:34PM -0400, Theodore Y. Ts'o wrote:
> What Josh is proposing I'm pretty sure would also break "e2fsck -E
> unshare_blocks", so that's another reason not to accept this as a
> valid format change.

The kernel already accepted this as a valid mountable filesystem format,
without a single error or warning of any kind, and has done so stably
for years.

> As far as I'm concerned, contrib/e2fsdroid is the canonical definition
> of how to create valid file systems with shared_blocks.

I'm not trying to create a problem here; I'm trying to address a whole
family of problems. I was generally under the impression that mounting
existing root filesystems fell under the scope of the kernel<->userspace
or kernel<->existing-system boundary, as defined by what the kernel
accepts and existing userspace has used successfully, and that upgrading
the kernel should work with existing userspace and systems. If there's
some other rule that applies for filesystems, I'm not aware of that.
(I'm also not trying to suggest that every random corner case of what
the kernel *could* accept needs to be the format definition, but rather,
cases that correspond to existing userspace.)

It wouldn't be *impossible* to work around this, this time; it may be
possible to adapt the existing userspace to work on the new and old
kernels. My concern is, if a filesystem format accepted by previous
kernels can be rejected by future kernels, what stops a future kernel
from further changing the format definition or its strictness
(co-evolving with one specific userspace) and causing further
regressions?

I don't *want* to rely on what apparently turned out to be an
undocumented bug in the kernel's validator. That's why I was trying to
fix the issue in what seemed like the right way, by detecting the
situation and turning off the validator. That seemed like it would fully
address the issue. If it would help, I could also supply a tiny filesystem
image for regression testing.

I'm trying to figure out what solution you'd like to see here, as long
as it isn't "any userspace that isn't e2fsdroid can be broken at will".
I'd be willing to work to adapt the userspace bits I have to work around
the regression, but I'd like to get this on the radar so this doesn't
happen again.

- Josh Triplett

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-05 Thread Josh Triplett

On Mon, Oct 05, 2020 at 10:36:39AM -0700, Darrick J. Wong wrote:
> On Mon, Oct 05, 2020 at 01:14:54AM -0700, Josh Triplett wrote:
> > Ran into an ext4 regression when testing upgrades to 5.9-rc kernels:
> > 
> > Commit e7bfb5c9bb3d ("ext4: handle add_system_zone() failure in
> > ext4_setup_system_zone()") breaks mounting of read-only ext4 filesystems
> > with intentionally overlapping bitmap blocks.
> > 
> > On an always-read-only filesystem explicitly marked with
> > EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS, prior to that commit, it's safe to
> > point all the block and inode bitmaps to a single block
> 
> LOL, WHAT?
> 
> I didn't know shared blocks applied to fs metadata.  I thought that
> "shared" only applied to file extent maps being able to share physical
> blocks.

The flag isn't documented very well yet, but since it specifically
forces the filesystem to always be mounted read-only, the bitmaps really
shouldn't matter at all. (In an ideal world, a permanently read-only
filesystem should be able to omit all the bitmaps entirely. Pointing
them all to a single disk block is the next best thing.)

> Could /somebody/ please document the ondisk format changes that are
> associated with this feature?

I pretty much had to sort it out by looking at a combination of
e2fsprogs and the kernel, and a lot of experimentation, until I ended up
with something that the kernel was completely happy with without a
single complaint.

I'd be happy to write up a summary of the format.

I'd still really like to see this patch applied for 5.9, to avoid having
filesystems that an old kernel can mount but a new one can't. This still
seems like a regression to me.

> > of all 1s,
> > because a read-only filesystem will never allocate or free any blocks or
> > inodes.
> 
> All 1s?  So the inode bitmap says that every inode table slot is in use,
> even if the inode record itself says it isn't?

Yes.

> What does e2fsck -n
> think about that kind of metadata inconsistency?

If you set up the rest of the metadata consistently with it (for
instance, 0 free blocks and 0 free inodes), you'll only get a single
complaint, from the e2fsck equivalent of block_validity. See
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=956509 for details on
that; with that fixed, e2fsck wouldn't complain at all. The kernel,
prior to 5.9-rc2, doesn't have a single complaint, whether at mount,
unmount, or read of every file and directory on the filesystem.

The errors you got in your e2fsck run came because you just overrode the
bitmaps, but didn't make the rest of the metadata consistent with that.
I can provide a sample filesystem if that would help.

- Josh

Re: ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-05 Thread Josh Triplett

On Mon, Oct 05, 2020 at 11:46:01AM +0200, Jan Kara wrote:
> On Mon 05-10-20 01:14:54, Josh Triplett wrote:
> > Ran into an ext4 regression when testing upgrades to 5.9-rc kernels:
> > 
> > Commit e7bfb5c9bb3d ("ext4: handle add_system_zone() failure in
> > ext4_setup_system_zone()") breaks mounting of read-only ext4 filesystems
> > with intentionally overlapping bitmap blocks.
> > 
> > On an always-read-only filesystem explicitly marked with
> > EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS, prior to that commit, it's safe to
> > point all the block and inode bitmaps to a single block of all 1s,
> > because a read-only filesystem will never allocate or free any blocks or
> > inodes.
> > However, after that commit, the block validity check rejects such
> > filesystems with -EUCLEAN and "failed to initialize system zone (-117)".
> > This causes systems that previously worked correctly to fail when
> > upgrading to v5.9-rc2 or later.
> > 
> > This was obviously a bugfix, and I'm not suggesting that it should be
> > reverted; it looks like this effectively worked by accident before,
> > because the block_validity check wasn't fully functional. However, this
> > does break real systems, and I'd like to get some kind of regression fix
> > in before 5.9 final if possible. I think it would suffice to make
> > block_validity default to false if and only if
> > EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS is set.
> > 
> > Does that seem like a reasonable fix?
> 
> Well, but EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS is your internal feature
> that's not present in current upstream kernel AFAICS.

It isn't "my" feature; the value for
EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS is defined in the headers in the
e2fsprogs tree. The kernel currently does absolutely nothing with it,
nor did it previously need to; it's just an RO_COMPAT feature which
ensures that the kernel can only mount the filesystem read-only. The
point is that an always-read-only filesystem will never change the block
or inode bitmaps, so ensuring they don't overlap is unnecessary (and
harmful).

I only added EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS to the header to
generate the corresponding ext4_has_feature_shared_blocks function.

I have filesystems that previous kernels mounted and worked with just
fine, and new kernels reject. That seems like a regression to me. I'm
suggesting the simplest possible way I can see to fix that regression.

Another approach would be to default block_validity to false for any
read-only filesystem mount (since it won't be written to), but that
seemed like it'd be more invasive; I was going for a more minimal
change.

ext4 regression in v5.9-rc2 from e7bfb5c9bb3d on ro fs with overlapped bitmaps

2020-10-05 Thread Josh Triplett

Ran into an ext4 regression when testing upgrades to 5.9-rc kernels:

Commit e7bfb5c9bb3d ("ext4: handle add_system_zone() failure in
ext4_setup_system_zone()") breaks mounting of read-only ext4 filesystems
with intentionally overlapping bitmap blocks.

On an always-read-only filesystem explicitly marked with
EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS, prior to that commit, it's safe to
point all the block and inode bitmaps to a single block of all 1s,
because a read-only filesystem will never allocate or free any blocks or
inodes.

However, after that commit, the block validity check rejects such
filesystems with -EUCLEAN and "failed to initialize system zone (-117)".
This causes systems that previously worked correctly to fail when
upgrading to v5.9-rc2 or later.

This was obviously a bugfix, and I'm not suggesting that it should be
reverted; it looks like this effectively worked by accident before,
because the block_validity check wasn't fully functional. However, this
does break real systems, and I'd like to get some kind of regression fix
in before 5.9 final if possible. I think it would suffice to make
block_validity default to false if and only if
EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS is set.

Does that seem like a reasonable fix?

Here's a quick sketch of a patch, which I've tested and confirmed to
work:

- 8< -
Subject: [PATCH] Fix ext4 regression in v5.9-rc2 on ro fs with overlapped 
bitmaps

Commit e7bfb5c9bb3d ("ext4: handle add_system_zone() failure in
ext4_setup_system_zone()") breaks mounting of read-only ext4 filesystems
with intentionally overlapping bitmap blocks.

On an always-read-only filesystem explicitly marked with
EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS, prior to that commit, it's safe to
point all the block and inode bitmaps to a single block of all 1s,
because a read-only filesystem will never allocate or free any blocks or
inodes.

However, after that commit, the block validity check rejects such
filesystems with -EUCLEAN and "failed to initialize system zone (-117)".
This causes systems that previously worked correctly to fail when
upgrading to v5.9-rc2 or later.

Fix this by defaulting block_validity to off when
EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS is set.

Signed-off-by: Josh Triplett 
Fixes: e7bfb5c9bb3d ("ext4: handle add_system_zone() failure in 
ext4_setup_system_zone()")
---
 fs/ext4/ext4.h  | 2 ++
 fs/ext4/super.c | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 523e00d7b392..7874028fa864 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1834,6 +1834,7 @@ static inline bool ext4_verity_in_progress(struct inode 
*inode)
 #define EXT4_FEATURE_RO_COMPAT_METADATA_CSUM   0x0400
 #define EXT4_FEATURE_RO_COMPAT_READONLY0x1000
 #define EXT4_FEATURE_RO_COMPAT_PROJECT 0x2000
+#define EXT4_FEATURE_RO_COMPAT_SHARED_BLOCKS   0x4000
 #define EXT4_FEATURE_RO_COMPAT_VERITY  0x8000
 
 #define EXT4_FEATURE_INCOMPAT_COMPRESSION  0x0001
@@ -1930,6 +1931,7 @@ EXT4_FEATURE_RO_COMPAT_FUNCS(bigalloc,
BIGALLOC)
 EXT4_FEATURE_RO_COMPAT_FUNCS(metadata_csum,METADATA_CSUM)
 EXT4_FEATURE_RO_COMPAT_FUNCS(readonly, READONLY)
 EXT4_FEATURE_RO_COMPAT_FUNCS(project,  PROJECT)
+EXT4_FEATURE_RO_COMPAT_FUNCS(shared_blocks,SHARED_BLOCKS)
 EXT4_FEATURE_RO_COMPAT_FUNCS(verity,   VERITY)
 
 EXT4_FEATURE_INCOMPAT_FUNCS(compression,   COMPRESSION)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ea425b49b345..f57a7e966e44 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3954,7 +3954,8 @@ static int ext4_fill_super(struct super_block *sb, void 
*data, int silent)
else
set_opt(sb, ERRORS_RO);
/* block_validity enabled by default; disable with noblock_validity */
-   set_opt(sb, BLOCK_VALIDITY);
+   if (!ext4_has_feature_shared_blocks(sb))
+   set_opt(sb, BLOCK_VALIDITY);
if (def_mount_opts & EXT4_DEFM_DISCARD)
set_opt(sb, DISCARD);

Re: [PATCH v2 0/4] Support non-blocking pidfds

2020-09-03 Thread Josh Triplett

On Wed, Sep 02, 2020 at 12:21:26PM +0200, Christian Brauner wrote:
> Hi,
> 
> Passing a non-blocking pidfd to waitid() currently has no effect, i.e.
> is not supported. There are users which would like to use waitid() on
> pidfds that are O_NONBLOCK and mix it with pidfds that are blocking and
> both pass them to waitid().
> The expected behavior is to have waitid() return -EAGAIN for
> non-blocking pidfds and to block for blocking pidfds without needing to
> perform any additional checks for flags set on the pidfd before passing
> it to waitid().
> Non-blocking pidfds will return EAGAIN from waitid() when no child
> process is ready yet. Returning -EAGAIN for non-blocking pidfds makes it
> easier for event loops that handle EAGAIN specially.
> 
> It also makes the API more consistent and uniform. In essence, waitid()
> is treated like a read on a non-blocking pidfd or a recvmsg() on a
> non-blocking socket.
> With the addition of support for non-blocking pidfds we support the same
> functionality that sockets do. For sockets() recvmsg() supports
> MSG_DONTWAIT for pidfds waitid() supports WNOHANG. Both flags are
> per-call options. In contrast non-blocking pidfds and non-blocking
> sockets are a setting on an open file description affecting all threads
> in the calling process as well as other processes that hold file
> descriptors referring to the same open file description. Both behaviors,
> per call and per open file description, have genuine use-cases.
> 
> A concrete use-case that was brought on-list (see [1]) was Josh's async
> pidfd library. Ever since the introduction of pidfds and more advanced
> async io various programming languages such as Rust have grown support
> for async event libraries. These libraries are created to help build
> epoll-based event loops around file descriptors. A common pattern is to
> automatically make all file descriptors they manage to O_NONBLOCK.
> 
> For such libraries the EAGAIN error code is treated specially. When a
> function is called that returns EAGAIN the function isn't called again
> until the event loop indicates the the file descriptor is ready.
> Supporting EAGAIN when waiting on pidfds makes such libraries just work
> with little effort.

Thanks for the patch series, Christian!

This will make it much easier to use pidfd in non-blocking event loops.

Reviewed-by: Josh Triplett 

- Josh Triplett

Re: [PATCH v2 2/4] exit: support non-blocking pidfds

2020-09-03 Thread Josh Triplett

On Wed, Sep 02, 2020 at 12:21:28PM +0200, Christian Brauner wrote:
> Passing a non-blocking pidfd to waitid() currently has no effect, i.e.  is not
> supported. There are users which would like to use waitid() on pidfds that are
> O_NONBLOCK and mix it with pidfds that are blocking and both pass them to
> waitid().
> The expected behavior is to have waitid() return -EAGAIN for non-blocking
> pidfds and to block for blocking pidfds without needing to perform any
> additional checks for flags set on the pidfd before passing it to waitid().
> Non-blocking pidfds will return EAGAIN from waitid() when no child process is
> ready yet. Returning -EAGAIN for non-blocking pidfds makes it easier for event
> loops that handle EAGAIN specially.
> 
> It also makes the API more consistent and uniform. In essence, waitid() is
> treated like a read on a non-blocking pidfd or a recvmsg() on a non-blocking
> socket.
> With the addition of support for non-blocking pidfds we support the same
> functionality that sockets do. For sockets() recvmsg() supports MSG_DONTWAIT
> for pidfds waitid() supports WNOHANG. Both flags are per-call options. In
> contrast non-blocking pidfds and non-blocking sockets are a setting on an open
> file description affecting all threads in the calling process as well as other
> processes that hold file descriptors referring to the same open file
> description. Both behaviors, per call and per open file description, have
> genuine use-cases.
> 
> The implementation should be straightforward, we simply raise the WNOHANG flag
> when a non-blocking pidfd is passed and when do_wait() returns without finding
> an eligible task and the pidfd is non-blocking we set EAGAIN.  If no child
> process exists non-blocking pidfd users will continue to see ECHILD but if
> child processes exist but have not yet exited users will see EAGAIN.
> 
> A concrete use-case that was brought on-list was Josh's async pidfd library.
> Ever since the introduction of pidfds and more advanced async io various
> programming languages such as Rust have grown support for async event
> libraries. These libraries are created to help build epoll-based event loops
> around file descriptors. A common pattern is to automatically make all file
> descriptors they manage to O_NONBLOCK.
> 
> For such libraries the EAGAIN error code is treated specially. When a function
> is called that returns EAGAIN the function isn't called again until the event
> loop indicates the the file descriptor is ready.  Supporting EAGAIN when
> waiting on pidfds makes such libraries just work with little effort.
> 
> Link: https://lore.kernel.org/lkml/20200811181236.GA18763@localhost/
> Link: https://github.com/joshtriplett/async-pidfd
> Cc: Kees Cook 
> Cc: Sargun Dhillon 
> Cc: Jann Horn 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Oleg Nesterov 
> Cc: "Peter Zijlstra (Intel)" 
> Suggested-by: Josh Triplett 
> Signed-off-by: Christian Brauner 

With or without the discussed change to WNOHANG behavior for
compatibility:
Reviewed-by: Josh Triplett 

Also, I think you should flip the order of patches 1 and 2, so that
there isn't a one-patch window in kernel history where you can create an
O_NONBLOCK pidfd with pidfd_open but it has no effect. I'd expect
userspace to use pidfd_open accepting or EINVAL-ing the flag as an
indication of whether it'll work.

Re: [PATCH v2 2/4] exit: support non-blocking pidfds

2020-09-03 Thread Josh Triplett

On Thu, Sep 03, 2020 at 05:38:47PM +0200, Christian Brauner wrote:
> On Thu, Sep 03, 2020 at 04:22:42PM +0200, Oleg Nesterov wrote:
> > On 09/02, Christian Brauner wrote:
> > >
> > > It also makes the API more consistent and uniform. In essence, waitid() is
> > > treated like a read on a non-blocking pidfd or a recvmsg() on a 
> > > non-blocking
> > > socket.
> > > With the addition of support for non-blocking pidfds we support the same
> > > functionality that sockets do. For sockets() recvmsg() supports 
> > > MSG_DONTWAIT
> > > for pidfds waitid() supports WNOHANG.
> > 
> > What I personally do not like is that waitid(WNOHANG) returns zero or EAGAIN
> > depending on f_flags & O_NONBLOCK... This doesn't match 
> > recvmsg(MSG_DONTWAIT)
> > and doesn't look consistent to me.
> 
> It's not my favorite thing either but async event loops are usually
> modeled around EAGAIN so I think this has benefits. Josh can speak more
> to that.

I wouldn't expect the same application to use both WNOHANG and
O_NONBLOCK, since the latter makes the former unnecessary. I'd have no
objection if WNOHANG continued to have the same "return 0 and you have
to check the structure to figure out what that means" behavior
regardless of the fd flags, for compatibility with an application or
library that expects that behavior with WNOHANG and didn't expect the
return value to change with a non-blocking fd.  waitid could just return
EAGAIN on a non-blocking fd if *not* passed WNOHANG, which would make
pidfd Just Work in non-blocking event loops.

Re: [PATCH v2 1/4] pidfd: support PIDFD_NONBLOCK in pidfd_open()

2020-09-03 Thread Josh Triplett

On Wed, Sep 02, 2020 at 12:21:27PM +0200, Christian Brauner wrote:
> Introduce PIDFD_NONBLOCK to support non-blocking pidfd file descriptors.
> 
> Ever since the introduction of pidfds and more advanced async io various
> programming languages such as Rust have grown support for async event
> libraries. These libraries are created to help build epoll-based event loops
> around file descriptors. A common pattern is to automatically make all file
> descriptors they manage to O_NONBLOCK.
> 
> For such libraries the EAGAIN error code is treated specially. When a function
> is called that returns EAGAIN the function isn't called again until the event
> loop indicates the the file descriptor is ready. Supporting EAGAIN when
> waiting on pidfds makes such libraries just work with little effort. In the
> following patch we will extend waitid() internally to support non-blocking
> pidfds.
> 
> This introduces a new flag PIDFD_NONBLOCK that is equivalent to O_NONBLOCK.
> This follows the same patterns we have for other (anon inode) file descriptors
> such as EFD_NONBLOCK, IN_NONBLOCK, SFD_NONBLOCK, TFD_NONBLOCK and the same for
> close-on-exec flags.
> 
> Link: https://lore.kernel.org/lkml/20200811181236.GA18763@localhost/
> Link: https://github.com/joshtriplett/async-pidfd
> Cc: Kees Cook 
> Cc: Sargun Dhillon 
> Cc: Oleg Nesterov 
> Suggested-by: Josh Triplett 
> Signed-off-by: Christian Brauner 

Reviewed-by: Josh Triplett

Re: pidfd and O_NONBLOCK

2020-08-11 Thread Josh Triplett

On Tue, Aug 11, 2020 at 10:10:45PM +0200, Christian Brauner wrote:
> On Tue, Aug 11, 2020 at 11:12:36AM -0700, Josh Triplett wrote:
> > As far as I can tell, O_NONBLOCK has no effect on a pidfd. When calling
> > waitid on a pidfd for a running process, it always blocks unless you
> > provide WNOHANG.
> > 
> > I don't think anything depends on that behavior. Would it be possible to
> > make O_NONBLOCK on a pidfd cause waitid on a running process to return
> > EWOULDBLOCK?
> > 
> > This would make it easier to use pidfd in some non-blocking event loops.
> 
> Hey Josh,
> 
> Just to see I did a _horrible_ draft (cf. [1]) and it seems doable to me
> and if you can provide a good rationale and a use-case then I think that
> would be ok.

Rationale and use case: there are some non-blocking event loop
libraries, such as the Rust async-io library, that help build
epoll-based event loops around file descriptors. Those libraries
automatically set O_NONBLOCK on the file descriptors they manage, and
they treat EWOULDBLOCK errno codes specially, with semantics like "call
this function, if it returns EWOULDBLOCK then don't call it again until
epoll says the fd is ready". If setting O_NONBLOCK on pidfd caused
waitid to return EWOULDBLOCK, such libraries would Just Work with very
little effort.

Also, pidfd_open should accept O_NONBLOCK as a flag, which in addition
to saving a call to fcntl would allow userspace to detect if this works.
(Even if you want to use fcntl to set it later, you can always just open
your own PID with pidfd_open and check if you get EINVAL to know that
your kernel doesn't support this.)

Thanks,
Josh Triplett

pidfd and O_NONBLOCK

2020-08-11 Thread Josh Triplett

As far as I can tell, O_NONBLOCK has no effect on a pidfd. When calling
waitid on a pidfd for a running process, it always blocks unless you
provide WNOHANG.

I don't think anything depends on that behavior. Would it be possible to
make O_NONBLOCK on a pidfd cause waitid on a running process to return
EWOULDBLOCK?

This would make it easier to use pidfd in some non-blocking event loops.

- Josh Triplett

Re: inherit TAINT_PROPRIETARY_MODULE v2

2020-08-01 Thread Josh Triplett

On July 31, 2020 11:53:08 PM PDT, Christoph Hellwig  wrote:
>[note: private reply now to start a flame fest with the usual suspects]

[You still CCed LKML.]

>On Fri, Jul 31, 2020 at 01:11:46PM -0700, j...@joshtriplett.org wrote:
>> Christoph Hellwig wrote:
>> > we've had a bug in our resolution of _GPL modules since day one, that
>> > is a module can claim to be GPL licensed and use _GPL exports, while
>> > it also depends on symbols from non-GPL modules.  This is used as a
>> > circumvention of the _GPL exports by using a small shim module using
>> > the _GPL exports and the other functionality.
>> 
>> This looks great. You might also consider doing the reverse: if a module
>> imports any EXPORT_SYMBOL_GPL symbols, any symbols that module in turn
>> exports shouldn't be importable by any module that doesn't explicitly
>> claim to be GPL-compatible. Effectively, if a module imports any
>> EXPORT_SYMBOL_GPL symbols, all of its exported symbols would then be
>> treated as EXPORT_SYMBOL_GPL.
>> 
>> This would catch the case of attempting to "wrap" EXPORT_SYMBOL_GPL
>> symbols in the other direction, by re-exporting the same or similar
>> functions to another module. (This would help catch mistakes, not just
>> intentional malice.)
>
>I'd personally 100% agree with that, but I'd rather clear it with Linus
>privately first.  This would basically make most of the usual
>modular subsystems unavailable to proprietary modules as all of them
>use _GPL driver core exports, and I suspect he'd cave into the screaming.

As a start, what about applying that logic specifically to out-of-tree modules? 
That would address the shim problem. The justification would be that in-tree 
modules have at least gone through some level of review on what they're 
exporting.

(Standard disclaimer: suggesting enhancements to the symbol licensing framework 
should not be taken as implicit endorsement of any legitimacy for 
non-GPL-compatible modules.)

Re: Linux kernel in-tree Rust support

2020-07-29 Thread Josh Triplett

On Tue, Jul 28, 2020 at 10:40:38PM +0200, Pavel Machek wrote:
> > We just need to make sure that any kernel CI infrastructure tests that
> > right away, then, so that failures don't get introduced by a patch from
> > someone without a Rust toolchain and not noticed until someone with a
> > Rust toolchain tests it.
> 
> So... I'm fan of Rust, but while trying to use it one thing was obvious: it
> takes _significantly_ longer than C to compile and needs gigabyte a lot of 
> RAM.
> 
> Kernel is quite big project, can CI infrastructure handle additional load?
> 
> Will developers see significantly longer compile times when Rust is 
> widespread?

I wouldn't expect the addition of Rust to the kernel to substantially
impact overall build time; on balance, I'd expect the major bottleneck
in kernel builds to continue to be linking and other serialized steps,
not compiling and other highly parallel steps.

There are also *many* things that can be done to improve Rust build time
in a project. And I don't expect that in-kernel Rust will have many
dependencies on third-party crates (since they'd need to be checked into
the tree).

Re: Linux kernel in-tree Rust support

2020-07-16 Thread Josh Triplett

On Thu, Jul 16, 2020 at 03:06:01PM +0200, Arnd Bergmann wrote:
> On Sun, Jul 12, 2020 at 9:39 PM Josh Triplett  wrote:
> > On Sun, Jul 12, 2020 at 03:31:51PM +0300, Adrian Bunk wrote:
> > >
> > > As an example:
> > > Ubuntu LTS releases upgrade to a new Rust version every 1-2 months.
> > > Ubuntu 16.04 started with Rust 1.7.0 and is now at Rust 1.41.0.
> > >
> > > It would not sound good to me if security updates of distribution
> > > kernels might additionally end up using a different version of the
> > > Rust compiler - the toolchain for the kernel should be stable.
> > >
> > > Would Rust usage in the kernel require distributions to ship
> > > a "Rust for Firefox" and a "Rust for the kernel"?
> >
> > Rust has hard stability guarantees when upgrading from one stable
> > version to the next. If code compiles with a given stable version of
> > Rust, it'll compile with a newer stable version of Rust. Given that, a
> > stable distribution will just need a single sufficiently up-to-date Rust
> > that meets the minimum version requirements of both Firefox and Linux.
> >
> > (That would not apply if the kernel used nightly Rust, since
> > nightly-only features are allowed to change before becoming stable;
> > that's one reason why we should use stable Rust, and try to get Firefox
> > to stick to stable Rust.)
> 
> I would expect we'd want a fairly tight coupling between kernel
> releases and minimum rust releases at first. Whatever is the latest
> stable rust version during the kernel's merge window might be
> assumed to be the minimum version for the life of that kernel, but
> an LTS release would not suddenly start relying on features
> from a newer compiler (thought it might warn about known bugs).

Agreed; we should be careful that any backported fix to an LTS kernel
does not increase the required Rust toolchain.

> While Linux used to build with 12 year old compilers (4.1 until
> 2018), we now require a 6 year old gcc (4.9) or 1 year old
> clang/llvm. I don't know whether these will fully converge over
> time but it seems sensible that the minimum rust frontend version
> we require for a new kernel release would eventually also fall
> in that range, requiring a compiler that is no more than a few
> years old, but not requiring the latest stable release.

I expect in the short term that we will likely have a need for features
from recent stable releases, especially when those features were added
specifically to support the kernel or similar, but the rate at which we
need new features will slow over time, and eventually we'll go from
"latest stable" to "within a year or so".

Re: Linux kernel in-tree Rust support

2020-07-13 Thread Josh Triplett

On Mon, Jul 13, 2020 at 01:11:13PM -0500, Eric W. Biederman wrote:
> Nick Desaulniers  writes:
> 
> > Hello folks,
> > I'm working on putting together an LLVM "Micro Conference" for the
> > upcoming Linux Plumbers Conf
> > (https://www.linuxplumbersconf.org/event/7/page/47-attend).  It's not
> > solidified yet, but I would really like to run a session on support
> > for Rust "in tree."  I suspect we could cover technical aspects of
> > what that might look like (I have a prototype of that, was trivial to
> > wire up KBuild support), but also a larger question of "should we do
> > this?" or "how might we place limits on where this can be used?"
> >
> > Question to folks explicitly in To:, are you planning on attending plumbers?
> >
> > If so, would this be an interesting topic that you'd participate in?
> 
> I have two big concerns about actually using rust.
> 
> 1) How large is the rust language support, and will each rust module
>need to duplicate it.  I seem to remember someone mentioning it is
>noticable in size.

There should only be a single copy, which could either be in the kernel
(if there's Rust code compiled into the kernel) or be in a "rust.ko"
support module. As long as you use the same version of Rust for the
kernel and all modules, you can supply symbols dynamically.

There are a few other steps we can take to control code size, as well.
In particular, there are tradeoffs between performance and size, such as
the amount of code in generics vs non-generics. (We also need to get
some further optimizations into Rust on that front, such as tail
merging.)

As for the size overall, given that we'll just be providing the portions
of "core" and "alloc" that the built code actually uses, and likely not
providing "std" at all, I would expect the size to remain relatively
small.

I very much care about overall kernel size, and I'm happy to help make
sure it remains reasonable.

> 2) What is rust usable for?  The rust type system will not admit
>doubly linked lists (or anything where two pointers point at the
>same memory) unless you are using an unsafe block.

There are libraries like intrusive-collections which implement
kernel-style structures with potentially multiple nodes in the structure
that put them into multiple lists at once. I would expect us to use
those (or in some cases implement our own). They don't need to be
written in C, just unsafe Rust that's wrapped up in a safe interface.

Just as the kernel has a variety of higher-level interfaces and data
structures that make working in the kernel sometimes *easier* than the
average C program, I'd expect that we'll end up with common Rust
structures that do what people need, rather than people reimplementing
their own.

- Josh Triplett

Re: Linux kernel in-tree Rust support

2020-07-12 Thread Josh Triplett

On Sun, Jul 12, 2020 at 03:31:51PM +0300, Adrian Bunk wrote:
> On Thu, Jul 09, 2020 at 11:41:47AM -0700, Nick Desaulniers wrote:
> >...
> > but also a larger question of "should we do
> > this?" or "how might we place limits on where this can be used?"
> >...
> 
> I won't attend, but I do have a topic that should be covered:
> 
> Firefox always depends on recent Rust, which forces distributions to 
> update Rust in stable releases.
> 
> As an example:
> Ubuntu LTS releases upgrade to a new Rust version every 1-2 months.
> Ubuntu 16.04 started with Rust 1.7.0 and is now at Rust 1.41.0.
> 
> It would not sound good to me if security updates of distribution
> kernels might additionally end up using a different version of the
> Rust compiler - the toolchain for the kernel should be stable.
> 
> Would Rust usage in the kernel require distributions to ship
> a "Rust for Firefox" and a "Rust for the kernel"?

Rust has hard stability guarantees when upgrading from one stable
version to the next. If code compiles with a given stable version of
Rust, it'll compile with a newer stable version of Rust. Given that, a
stable distribution will just need a single sufficiently up-to-date Rust
that meets the minimum version requirements of both Firefox and Linux.

(That would not apply if the kernel used nightly Rust, since
nightly-only features are allowed to change before becoming stable;
that's one reason why we should use stable Rust, and try to get Firefox
to stick to stable Rust.)

Re: Linux kernel in-tree Rust support

2020-07-11 Thread Josh Triplett

On Fri, Jul 10, 2020 at 04:54:11PM -0700, Linus Torvalds wrote:
> On Fri, Jul 10, 2020 at 3:59 PM Josh Triplett  wrote:
> > As I recall, Greg's biggest condition for initial introduction of this
> > was to do the same kind of "turn this Kconfig option on and turn an
> > option under it off" trick that LTO uses, so that neither "make
> > allnoconfig" nor "make allyesconfig" would require Rust until we've had
> > plenty of time to experiment with it.
> 
> No, please make it a "is rust available" automatic config option. The
> exact same way we already do the compiler versions and check for
> various availability of compiler flags at config time.

That sounds even better, and will definitely allow for more testing.

We just need to make sure that any kernel CI infrastructure tests that
right away, then, so that failures don't get introduced by a patch from
someone without a Rust toolchain and not noticed until someone with a
Rust toolchain tests it.

- Josh

Re: Linux kernel in-tree Rust support

2020-07-10 Thread Josh Triplett

On Fri, Jul 10, 2020 at 02:50:22PM +0200, Christian Brauner wrote:
> On Fri, Jul 10, 2020 at 08:28:03AM +0200, Greg KH wrote:
> > On Thu, Jul 09, 2020 at 11:41:47AM -0700, Nick Desaulniers wrote:
> > > Hello folks,
> > > I'm working on putting together an LLVM "Micro Conference" for the
> > > upcoming Linux Plumbers Conf
> > > (https://www.linuxplumbersconf.org/event/7/page/47-attend).  It's not
> > > solidified yet, but I would really like to run a session on support
> > > for Rust "in tree."  I suspect we could cover technical aspects of
> > > what that might look like (I have a prototype of that, was trivial to
> > > wire up KBuild support), but also a larger question of "should we do
> > > this?" or "how might we place limits on where this can be used?"
> > > 
> > > Question to folks explicitly in To:, are you planning on attending 
> > > plumbers?
> > > 
> > > If so, would this be an interesting topic that you'd participate in?
> > 
> > Yes, I'll be there.
> 
> We actually had this dicussion a while back and there were some more
> people interested in this. I'd be interested to attend this and I've
> spoken with Kees and a few others about this topic at last Plumbers (I
> think Greg might have been around for this informal discussion as well.
> But I might be imagining things.).

I was around for one of the informal conversations with Greg and Kees
and others.

As I recall, Greg's biggest condition for initial introduction of this
was to do the same kind of "turn this Kconfig option on and turn an
option under it off" trick that LTO uses, so that neither "make
allnoconfig" nor "make allyesconfig" would require Rust until we've had
plenty of time to experiment with it. And that seems entirely
reasonable to me too.

- Josh

Re: Linux kernel in-tree Rust support

2020-07-09 Thread Josh Triplett

On Thu, Jul 09, 2020 at 11:41:47AM -0700, Nick Desaulniers wrote:
> Hello folks,
> I'm working on putting together an LLVM "Micro Conference" for the
> upcoming Linux Plumbers Conf
> (https://www.linuxplumbersconf.org/event/7/page/47-attend).  It's not
> solidified yet, but I would really like to run a session on support
> for Rust "in tree."  I suspect we could cover technical aspects of
> what that might look like (I have a prototype of that, was trivial to
> wire up KBuild support), but also a larger question of "should we do
> this?" or "how might we place limits on where this can be used?"
> 
> Question to folks explicitly in To:, are you planning on attending plumbers?
> 
> If so, would this be an interesting topic that you'd participate in?

I hadn't planned to attend the virtual event, but this sounds like a
topic I absolutely have to attend. Please follow up if this proposal
gets accepted.

I'd love to see a path to incorporating Rust into the kernel, as long as
we can ensure that:
- There are appropriate Rustic interfaces that are natural and safe to
  use (not just C FFI, and not *just* trivial transformations like
  slices instead of buffer+len pairs).
- Those Rustic interfaces are easy to maintain and evolve with the kernel.
- We provide compelling use cases that go beyond just basic safety, such
  as concurrency checking, or lifetimes for object ownership.
- We make Rust fit naturally into the kernel's norms and standards,
  while also introducing some of Rust's norms and standards where they
  make sense. (We want to fit into the kernel, and at the same time, we
  don't want to hastily saw off all the corners that don't immediately
  fit, because some of those corners provide value. Let's take our
  time.)
- We move slowly and carefully, making sure it's a gradual introduction,
  and give people time to incorporate the Rust toolchain into their
  kernel workflows.

Also, with my "Rust language team lead" hat on, I'd be happy to have the
Linux kernel feeding into Rust language development priorities. If
building Rustic interfaces within the kernel requires some additional
language features, we should see what enhancements to the language would
best serve those requirements. I've often seen the sentiment that
co-evolving Linux and a C compiler would be beneficial for both; I think
the same would be true of Linux and the Rust compiler.

Re: [PATCH 00/18] Allow architectures to override __READ_ONCE()

2020-07-01 Thread Josh Triplett

On Tue, Jun 30, 2020 at 06:37:16PM +0100, Will Deacon wrote:
> The patches allow architectures to provide their own implementation of
> __READ_ONCE(). This serves two main purposes:
> 
>   1. It finally allows us to remove [smp_]read_barrier_depends() from the
>  Linux memory model and make it an implementation detail of the Alpha
>  back-end.

And there was much rejoicing. Thank you.

[PATCH v2] serial: 8250: Enable 16550A variants by default on non-x86

2020-05-26 Thread Josh Triplett

Some embedded devices still use these serial ports; make sure they're
still enabled by default on architectures more likely to have them, to
avoid rendering someone's console unavailable.

Reported-by: Vladimir Oltean 
Reported-by: Maxim Kochetkov 
Fixes: dc56ecb81a0a ("serial: 8250: Support disabling mdelay-filled probes of 
16550A variants")
Signed-off-by: Josh Triplett 
---

v2: Add Reported-by lines.

 drivers/tty/serial/8250/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig
index af0688156dd0..8195a31519ea 100644
--- a/drivers/tty/serial/8250/Kconfig
+++ b/drivers/tty/serial/8250/Kconfig
@@ -63,6 +63,7 @@ config SERIAL_8250_PNP
 config SERIAL_8250_16550A_VARIANTS
bool "Support for variants of the 16550A serial port"
depends on SERIAL_8250
+   default !X86
help
  The 8250 driver can probe for many variants of the venerable 16550A
  serial port. Doing so takes additional time at boot.
-- 
2.27.0.rc0

Re: [PATCH] serial: 8250: Enable 16550A variants by default on non-x86

2020-05-26 Thread Josh Triplett

On Tue, May 26, 2020 at 11:47:44AM +0200, Greg Kroah-Hartman wrote:
> On Tue, May 26, 2020 at 01:40:06AM -0700, Josh Triplett wrote:
> > Some embedded devices still use these serial ports; make sure they're
> > still enabled by default on architectures more likely to have them, to
> > avoid rendering someone's console unavailable.
> > 
> > Fixes: dc56ecb81a0a ("serial: 8250: Support disabling mdelay-filled probes 
> > of 16550A variants")
> > Signed-off-by: Josh Triplett 
> > ---
> > 
> > Based on user reports from embedded devices that need these variants.
> 
> No "reported-by:" lines for these people?  Not even the author of the
> other competing patch for this issue?

Genuinely forgot that the "Reported-by" tag existed; thank you for the
reminder. Will fix.

- Josh Triplett

[PATCH] serial: 8250: Enable 16550A variants by default on non-x86

2020-05-26 Thread Josh Triplett

Some embedded devices still use these serial ports; make sure they're
still enabled by default on architectures more likely to have them, to
avoid rendering someone's console unavailable.

Fixes: dc56ecb81a0a ("serial: 8250: Support disabling mdelay-filled probes of 
16550A variants")
Signed-off-by: Josh Triplett 
---

Based on user reports from embedded devices that need these variants.

 drivers/tty/serial/8250/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig
index af0688156dd0..8195a31519ea 100644
--- a/drivers/tty/serial/8250/Kconfig
+++ b/drivers/tty/serial/8250/Kconfig
@@ -63,6 +63,7 @@ config SERIAL_8250_PNP
 config SERIAL_8250_16550A_VARIANTS
bool "Support for variants of the 16550A serial port"
depends on SERIAL_8250
+   default !X86
help
  The 8250 driver can probe for many variants of the venerable 16550A
  serial port. Doing so takes additional time at boot.
-- 
2.27.0.rc0

Re: [PATCH] serial: 8250: probe all 16550A variants by default

2020-05-26 Thread Josh Triplett

On Tue, May 26, 2020 at 10:05:25AM +0300, Maxim Kochetkov wrote:
> This change breaks all my devices: OMAP-L138 (davinci based), LS1021A,
> T1040, Marvell (kirkwood2 based). Only enabling VARIANTS on all my devices
> fix the issue.

Preparing a patch right now, with the appropriate Fixes tag so it should
end up on any kernel that has the original patch.

- Josh Triplett

Re: [PATCH] serial: 8250: probe all 16550A variants by default

2020-05-25 Thread Josh Triplett

On Mon, May 25, 2020 at 09:52:54PM +0300, Vladimir Oltean wrote:
> Hi Josh,
> 
> On Mon, 25 May 2020 at 20:28, Josh Triplett  wrote:
> >
> > On Mon, May 25, 2020 at 04:02:38PM +0300, Vladimir Oltean wrote:
> > > On NXP T1040, the UART is typically detected as 16550A_FSL64. After said
> > > patch, it gets detected as plain 16550A and the Linux console is
> > > completely garbled and missing characters.
> >
> > Interesting that there's *new* powerpc hardware that needs these
> > variants. I based the patch on the fact that, on x86 at least, hardware
> > using these variants hasn't been made for a long time.
> >
> > In the hopes of preserving at least part of the benefit of the patch,
> > could you please change it to `default y if !X86_64`?
> >
> 
> Why don't you add CONFIG_SERIAL_8250_16550A_VARIANTS=n in x86_64_defconfig?

In general, it seems preferable to me when the defconfig files differ
less from the defaults encoded in Kconfig.

You're proposing a change to the default; could you please include one
or the other additional change in your patch to preserve the behavior on
x86_64?

> > >  drivers/tty/serial/8250/Kconfig | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/tty/serial/8250/Kconfig 
> > > b/drivers/tty/serial/8250/Kconfig
> > > index af0688156dd0..89c7ecb55619 100644
> > > --- a/drivers/tty/serial/8250/Kconfig
> > > +++ b/drivers/tty/serial/8250/Kconfig
> > > @@ -63,6 +63,7 @@ config SERIAL_8250_PNP
> > >  config SERIAL_8250_16550A_VARIANTS
> > >   bool "Support for variants of the 16550A serial port"
> > >   depends on SERIAL_8250
> > > + default y
> > >   help
> > > The 8250 driver can probe for many variants of the venerable 
> > > 16550A
> > > serial port. Doing so takes additional time at boot.
> > > --
> > > 2.25.1
> > >
> 
> Thanks,
> -Vladimir

Re: [PATCH] serial: 8250: probe all 16550A variants by default

2020-05-25 Thread Josh Triplett

On Mon, May 25, 2020 at 04:02:38PM +0300, Vladimir Oltean wrote:
> On NXP T1040, the UART is typically detected as 16550A_FSL64. After said
> patch, it gets detected as plain 16550A and the Linux console is
> completely garbled and missing characters.

Interesting that there's *new* powerpc hardware that needs these
variants. I based the patch on the fact that, on x86 at least, hardware
using these variants hasn't been made for a long time.

In the hopes of preserving at least part of the benefit of the patch,
could you please change it to `default y if !X86_64`?

>  drivers/tty/serial/8250/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig
> index af0688156dd0..89c7ecb55619 100644
> --- a/drivers/tty/serial/8250/Kconfig
> +++ b/drivers/tty/serial/8250/Kconfig
> @@ -63,6 +63,7 @@ config SERIAL_8250_PNP
>  config SERIAL_8250_16550A_VARIANTS
>   bool "Support for variants of the 16550A serial port"
>   depends on SERIAL_8250
> + default y
>   help
> The 8250 driver can probe for many variants of the venerable 16550A
> serial port. Doing so takes additional time at boot.
> -- 
> 2.25.1
>

Re: [PATCH v2 0/7] padata: parallelize deferred page init

2020-05-21 Thread Josh Triplett

On Wed, May 20, 2020 at 02:26:38PM -0400, Daniel Jordan wrote:
> Please review and test, and thanks to Alex, Andrew, Josh, and Pavel for
> their feedback in the last version.

I re-tested v2:

Tested-by: Josh Triplett 

[0.231435] node 1 initialised, 24189223 pages in 32ms
[0.236718] node 0 initialised, 23398907 pages in 36ms

Re: [PATCH 2/3] security: add symbol namespace for reading file data

2020-05-13 Thread Josh Triplett

On Wed, May 13, 2020 at 04:16:22PM +, Luis Chamberlain wrote:
> On Wed, May 13, 2020 at 10:40:31AM -0500, Eric W. Biederman wrote:
> > Luis Chamberlain  writes:
> > 
> > > Certain symbols are not meant to be used by everybody, the security
> > > helpers for reading files directly is one such case. Use a symbol
> > > namespace for them.
> > >
> > > This will prevent abuse of use of these symbols in places they were
> > > not inteded to be used, and provides an easy way to audit where these
> > > types of operations happen as a whole.
> > 
> > Why not just remove the ability for the firmware loader to be a module?
> > 
> > Is there some important use case that requires the firmware loader
> > to be a module?
> > 
> > We already compile the code in by default.  So it is probably just
> > easier to remove the modular support all together.  Which would allow
> > the export of the security hooks to be removed as well.
> 
> Yeah, that's a better solution. The only constaint I am aware of is
> we *cannot* change the name of the module from firmware_class since the
> old fallback sysfs loader depends on the module name. So, so long as we
> take care with that on built-in and document this very well, I think
> we should be good.
> 
> I checked the commit logs and this was tristate since the code was added
> upstream, so I cannot see any good reason it was enabled as modular.
> 
> Speaking with a *backports experience* hat on, we did have a use case
> to use a module for it in case a new feature was added upstream which
> was not present on older kernels. However I think that using a separate
> symbol prefix would help with that.
> 
> Would any Android stakeholders / small / embedded folks whave any issue
> with this?

As long as you can still *completely* compile out firmware loading, I
don't think there's a huge use case for making it modular. y/n seems
fine.

Re: [PATCH 6/7] mm: parallelize deferred_init_memmap()

2020-05-04 Thread Josh Triplett

On May 4, 2020 3:33:58 PM PDT, Alexander Duyck  
wrote:
>On Thu, Apr 30, 2020 at 1:12 PM Daniel Jordan
> wrote:
>> /*
>> -* Initialize and free pages in MAX_ORDER sized increments so
>> -* that we can avoid introducing any issues with the buddy
>> -* allocator.
>> +* More CPUs always led to greater speedups on tested
>systems, up to
>> +* all the nodes' CPUs.  Use all since the system is
>otherwise idle now.
>>  */
>
>I would be curious about your data. That isn't what I have seen in the
>past. Typically only up to about 8 or 10 CPUs gives you any benefit,
>beyond that I was usually cache/memory bandwidth bound.

I've found pretty much linear performance up to memory bandwidth, and on the 
systems I was testing, I didn't saturate memory bandwidth until about the full 
number of physical cores. From number of cores up to number of threads, the 
performance stayed about flat; it didn't get any better or worse.

- Josh

Re: [PATCH 0/7] padata: parallelize deferred page init

2020-04-30 Thread Josh Triplett

On Thu, Apr 30, 2020 at 04:11:18PM -0400, Daniel Jordan wrote:
> Sometimes the kernel doesn't take full advantage of system memory
> bandwidth, leading to a single CPU spending excessive time in
> initialization paths where the data scales with memory size.
> 
> Multithreading naturally addresses this problem, and this series is the
> first step.
> 
> It extends padata, a framework that handles many parallel singlethreaded
> jobs, to handle multithreaded jobs as well by adding support for
> splitting up the work evenly, specifying a minimum amount of work that's
> appropriate for one helper thread to do, load balancing between helpers,
> and coordinating them.  More documentation in patches 4 and 7.
> 
> The first user is deferred struct page init, a large bottleneck in
> kernel boot--actually the largest for us and likely others too.  This
> path doesn't require concurrency limits, resource control, or priority
> adjustments like future users will (vfio, hugetlb fallocate, munmap)
> because it happens during boot when the system is otherwise idle and
> waiting on page init to finish.
> 
> This has been tested on a variety of x86 systems and speeds up kernel
> boot by 6% to 49% by making deferred init 63% to 91% faster.  Patch 6
> has detailed numbers.  Test results from other systems appreciated.
> 
> This series is based on v5.6 plus these three from mmotm:
> 
>   mm-call-touch_nmi_watchdog-on-max-order-boundaries-in-deferred-init.patch
>   mm-initialize-deferred-pages-with-interrupts-enabled.patch
>   mm-call-cond_resched-from-deferred_init_memmap.patch
> 
> All of the above can be found in this branch:
> 
>   git://oss.oracle.com/git/linux-dmjordan.git padata-mt-definit-v1
>   
> https://oss.oracle.com/git/gitweb.cgi?p=linux-dmjordan.git;a=shortlog;h=refs/heads/padata-mt-definit-v1

For the series (and the three prerequisite patches):

Tested-by: Josh Triplett 

Thank you for writing this, and thank you for working towards
upstreaming it!

Re: [PATCH 0/7] padata: parallelize deferred page init

2020-04-30 Thread Josh Triplett

On Thu, Apr 30, 2020 at 02:31:31PM -0700, Andrew Morton wrote:
> On Thu, 30 Apr 2020 16:11:18 -0400 Daniel Jordan  
> wrote:
> > Sometimes the kernel doesn't take full advantage of system memory
> > bandwidth, leading to a single CPU spending excessive time in
> > initialization paths where the data scales with memory size.
> > 
> > Multithreading naturally addresses this problem, and this series is the
> > first step.
> > 
> > It extends padata, a framework that handles many parallel singlethreaded
> > jobs, to handle multithreaded jobs as well by adding support for
> > splitting up the work evenly, specifying a minimum amount of work that's
> > appropriate for one helper thread to do, load balancing between helpers,
> > and coordinating them.  More documentation in patches 4 and 7.
> > 
> > The first user is deferred struct page init, a large bottleneck in
> > kernel boot--actually the largest for us and likely others too.  This
> > path doesn't require concurrency limits, resource control, or priority
> > adjustments like future users will (vfio, hugetlb fallocate, munmap)
> > because it happens during boot when the system is otherwise idle and
> > waiting on page init to finish.
> > 
> > This has been tested on a variety of x86 systems and speeds up kernel
> > boot by 6% to 49% by making deferred init 63% to 91% faster.
> 
> How long is this up-to-91% in seconds?  If it's 91% of a millisecond
> then not impressed.  If it's 91% of two weeks then better :)

Some test results on a system with 96 CPUs and 192GB of memory:

Without this patch series:
[0.487132] node 0 initialised, 23398907 pages in 292ms
[0.499132] node 1 initialised, 24189223 pages in 304ms
...
[0.629376] Run /sbin/init as init process

With this patch series:
[0.227868] node 0 initialised, 23398907 pages in 28ms
[0.230019] node 1 initialised, 24189223 pages in 28ms
...
[0.361069] Run /sbin/init as init process

That makes a huge difference; memory initialization is the largest
remaining component of boot time.

> Relatedly, how important is boot time on these large machines anyway? 
> They presumably have lengthy uptimes so boot time is relatively
> unimportant?

Cloud systems and other virtual machines may have a huge amount of
memory but not necessarily run for a long time; on such systems, boot
time becomes critically important. Reducing boot time on cloud systems
and VMs makes the difference between "leave running to reduce latency"
and "just boot up when needed".

- Josh Triplett

Re: [PATCH v4 0/5] init: Do not select DEBUG_KERNEL by default

2019-04-11 Thread Josh Triplett

On Thu, Apr 11, 2019 at 11:13:42PM -0400, Sinan Kaya wrote:
> On 4/11/2019 11:02 PM, Josh Triplett wrote:
> > I noticed one minor typo in patch 1/5, with that fixed, for the whole
> > series:
> 
> Can you point to the typo?

I did, in my response to the patch itself:
s/Miscellaneous/miscellaneous/ in the new option's description, since it
isn't at the start of a sentence.

Re: [PATCH v4 0/5] init: Do not select DEBUG_KERNEL by default

2019-04-11 Thread Josh Triplett

On Fri, Apr 12, 2019 at 01:43:50AM +, Sinan Kaya wrote:
> CONFIG_DEBUG_KERNEL has been designed to just enable Kconfig options.
> Kernel code generatoin should not depend on CONFIG_DEBUG_KERNEL.
> 
> Proposed alternative plan: let's add a new symbol, something like
> DEBUG_MISC ("Miscellaneous debug code that should be under a more
> specific debug option but isn't"), make it depend on DEBUG_KERNEL and be
> "default DEBUG_KERNEL" but allow itself to be turned off, and then
> mechanically change the small handful of "#ifdef CONFIG_DEBUG_KERNEL" to
> "#ifdef CONFIG_DEBUG_MISC".
> 
> Sinan Kaya (5):
>   init: Introduce DEBUG_MISC option
>   powerpc: Replace CONFIG_DEBUG_KERNEL with CONFIG_DEBUG_MISC
>   mips: Replace CONFIG_DEBUG_KERNEL with CONFIG_DEBUG_MISC
>   xtensa: Replace CONFIG_DEBUG_KERNEL with CONFIG_DEBUG_MISC
>   net: Replace CONFIG_DEBUG_KERNEL with CONFIG_DEBUG_MISC

I noticed one minor typo in patch 1/5, with that fixed, for the whole
series:

Reviewed-by: Josh Triplett

Re: [PATCH v4 1/5] init: Introduce DEBUG_MISC option

2019-04-11 Thread Josh Triplett

On Fri, Apr 12, 2019 at 01:43:51AM +, Sinan Kaya wrote:
> Introduce DEBUG_MISC ("Miscellaneous debug code that should be under a more
> specific debug option but isn't"), make it depend on DEBUG_KERNEL and be
> "default DEBUG_KERNEL" but allow itself to be turned off, and then
> mechanically change the small handful of "#ifdef CONFIG_DEBUG_KERNEL" to
> "#ifdef CONFIG_DEBUG_MISC".
> 
> Signed-off-by: Sinan Kaya 

Minor typo below; with that:
Reviewed-by: Josh Triplett 

And thank you!

> ---
>  lib/Kconfig.debug | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 0d9e81779e37..2f80ebaa6d9a 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -438,6 +438,15 @@ config DEBUG_KERNEL
> Say Y here if you are developing drivers or trying to debug and
> identify kernel problems.
>  
> +config DEBUG_MISC
> + bool "Miscellaneous debug code"
> + default DEBUG_KERNEL
> + depends on DEBUG_KERNEL
> + help
> +   Say Y here if you need to enable Miscellaneous debug code that should

Nit: s/Miscellaneous/miscellaneous/

> +   be under a more specific debug option but isn't.
> +
> +
>  menu "Memory Debugging"
>  
>  source "mm/Kconfig.debug"
> -- 
> 2.21.0
>

Re: [PATCH v2] init: Do not select DEBUG_KERNEL by default

2019-04-11 Thread Josh Triplett

On Thu, Apr 11, 2019 at 06:27:04PM -0400, Sinan Kaya wrote:
> On 4/11/2019 6:21 PM, Kees Cook wrote:
> > > Proposed alternative plan: let's add a new symbol, something like
> > > DEBUG_MISC ("Miscellaneous debug code that should be under a more
> > > specific debug option but isn't"), make it depend on DEBUG_KERNEL and be
> > > "default DEBUG_KERNEL" but allow itself to be turned off, and then
> > > mechanically change the small handful of "#ifdef CONFIG_DEBUG_KERNEL" to
> > > "#ifdef CONFIG_DEBUG_MISC".
> > > 
> > > Does that sound like an appropriately rapid solution for this bug?
> > Sure, that sounds fine to me. Sinan can you take care of that for v4?
> 
> Sure, let me work on this.

Thank you both! I really appreciate that.

Re: [PATCH v3] init: Do not select DEBUG_KERNEL by default

2019-04-11 Thread Josh Triplett

On Thu, Apr 11, 2019 at 10:00:24AM -0700, Kees Cook wrote:
> On Wed, Apr 10, 2019 at 10:34 PM Masahiro Yamada
>  wrote:
> >
> > On Thu, Apr 11, 2019 at 11:47 AM Kees Cook  wrote:
> > >
> > > On Wed, Apr 10, 2019 at 5:56 PM Sinan Kaya  wrote:
> > > >
> > > > We can't seem to have a kernel with CONFIG_EXPERT set but
> > > > CONFIG_DEBUG_KERNEL unset these days.
> > > >
> > > > While some of the features under the CONFIG_EXPERT require
> > > > CONFIG_DEBUG_KERNEL, it doesn't apply for all features.
> > > >
> > > > It looks like CONFIG_KALLSYMS_ALL is the only feature that
> > > > requires CONFIG_DEBUG_KERNEL.
> > > >
> > > > Select CONFIG_EXPERT when CONFIG_DEBUG_KERNEL is chosen but
> > > > you can still choose CONFIG_EXPERT without CONFIG_DEBUG_KERNEL.
> > > >
> > > > Signed-off-by: Sinan Kaya 
> > > > Reviewed-by: Kees Cook 
> > >
> > > Masahiro, should this go via your tree, or somewhere else?
> >
> >
> > I think somewhere else.
> 
> Okay. Andrew, can you take this?

Echoing my comment from the previous thread: this does not seem like the
right solution to me and it introduces a new problem. Please see my
response to v2 for a quick alternative that would address the underlying
bug, which is that some code generation depends on DEBUG_KERNEL.

Re: [PATCH v2] init: Do not select DEBUG_KERNEL by default

2019-04-11 Thread Josh Triplett

On Wed, Apr 10, 2019 at 11:13:52PM -0400, Sinan Kaya wrote:
> On 4/10/2019 11:02 PM, Josh Triplett wrote:
> > Then let's fix*that*, and get checkpatch to help enforce it in the future. 
> > EXPERT doesn't affect code generation, and neither should this.
> 
> I think we have to do both. We need to go after the users as well as
> solve the immediate problem per this patch.
> 
> As Mathieu identified, CONFIG_DEBUG_KERNEL is being used all over the
> place and getting subsystem owners to remove let alone add a check
> to checkpatch is just going to take time.
> 
> Please let us know if you are OK with this plan.

I'm not OK with this plan. Turning on EXPERT should make the options
under DEBUG_KERNEL visible; it's a bug that DEBUG_KERNEL affects code
generation as well.

Proposed alternative plan: let's add a new symbol, something like
DEBUG_MISC ("Miscellaneous debug code that should be under a more
specific debug option but isn't"), make it depend on DEBUG_KERNEL and be
"default DEBUG_KERNEL" but allow itself to be turned off, and then
mechanically change the small handful of "#ifdef CONFIG_DEBUG_KERNEL" to
"#ifdef CONFIG_DEBUG_MISC".

Does that sound like an appropriately rapid solution for this bug?

Re: [PATCH v2] init: Do not select DEBUG_KERNEL by default

2019-04-10 Thread Josh Triplett

On April 10, 2019 4:24:18 PM PDT, Kees Cook  wrote:
>On Wed, Apr 10, 2019 at 4:22 PM Josh Triplett 
>wrote:
>>
>> On April 10, 2019 3:58:55 PM PDT, Kees Cook 
>wrote:
>> >On Wed, Apr 10, 2019 at 3:42 PM Sinan Kaya  wrote:
>> >>
>> >> We can't seem to have a kernel with CONFIG_EXPERT set but
>> >> CONFIG_DEBUG_KERNEL unset these days.
>> >>
>> >> While some of the features under the CONFIG_EXPERT require
>> >> CONFIG_DEBUG_KERNEL, it doesn't apply for all features.
>> >>
>> >> It looks like CONFIG_KALLSYMS_ALL is the only feature that
>> >> requires CONFIG_DEBUG_KERNEL.
>> >>
>> >> Select CONFIG_EXPERT when CONFIG_DEBUG is chosen but you can
>> >
>> >Typo: CONFIG_DEBUG_KERNEL
>> >
>> >> still choose CONFIG_EXPERT without CONFIG_DEBUG.
>> >
>> >same.
>> >
>> >>
>> >> Signed-off-by: Sinan Kaya 
>> >
>> >But with those fixed, looks good to me. Adding Josh (and others) to
>CC
>> >since he originally added the linkage to EXPERT in commit
>> >f505c553dbe2.
>>
>> CONFIG_DEBUG_KERNEL shouldn't affect code generation in any way; it
>should only make more options appear in kconfig. I originally added
>this to ensure that features you might want to *disable* aren't hidden,
>as part of the tinification effort.
>>
>> What specific problem does having CONFIG_DEBUG_KERNEL enabled cause
>for you? I'd still prefer to have a single switch for "don't hide
>things I might want to disable", rather than several.
>
>See earlier in the thread: code generation depends on
>CONFIG_DEBUG_KERNEL now unfortunately.

Then let's fix *that*, and get checkpatch to help enforce it in the future. 
EXPERT doesn't affect code generation, and neither should this.

Re: [PATCH v2] init: Do not select DEBUG_KERNEL by default

2019-04-10 Thread Josh Triplett

On April 10, 2019 3:58:55 PM PDT, Kees Cook  wrote:
>On Wed, Apr 10, 2019 at 3:42 PM Sinan Kaya  wrote:
>>
>> We can't seem to have a kernel with CONFIG_EXPERT set but
>> CONFIG_DEBUG_KERNEL unset these days.
>>
>> While some of the features under the CONFIG_EXPERT require
>> CONFIG_DEBUG_KERNEL, it doesn't apply for all features.
>>
>> It looks like CONFIG_KALLSYMS_ALL is the only feature that
>> requires CONFIG_DEBUG_KERNEL.
>>
>> Select CONFIG_EXPERT when CONFIG_DEBUG is chosen but you can
>
>Typo: CONFIG_DEBUG_KERNEL
>
>> still choose CONFIG_EXPERT without CONFIG_DEBUG.
>
>same.
>
>>
>> Signed-off-by: Sinan Kaya 
>
>But with those fixed, looks good to me. Adding Josh (and others) to CC
>since he originally added the linkage to EXPERT in commit
>f505c553dbe2.

CONFIG_DEBUG_KERNEL shouldn't affect code generation in any way; it should only 
make more options appear in kconfig. I originally added this to ensure that 
features you might want to *disable* aren't hidden, as part of the tinification 
effort.

What specific problem does having CONFIG_DEBUG_KERNEL enabled cause for you? 
I'd still prefer to have a single switch for "don't hide things I might want to 
disable", rather than several.

This would also need checking to make sure it doesn't grow tinyconfig.

Re: [RFC PATCH v4 5/5] x86/vdso: Add __vdso_sgx_enter_enclave() to wrap SGX enclave transitions

2018-12-14 Thread Josh Triplett

On Fri, Dec 14, 2018 at 09:03:11AM -0800, Sean Christopherson wrote:
> On Fri, Dec 14, 2018 at 07:38:30AM -0800, Sean Christopherson wrote:
> > On Fri, Dec 14, 2018 at 07:12:04AM -0800, Sean Christopherson wrote:
> > > On Fri, Dec 14, 2018 at 09:55:49AM +, Jethro Beekman wrote:
> > > > On 2018-12-14 03:01, Sean Christopherson wrote:
> > > > >+2:pop %rbx
> > > > >+  pop %r12
> > > > >+  pop %r13
> > > > >+  pop %r14
> > > > >+  pop %r15
> > > > >+  pop %rbp
> > > > >+  ret
> > > > 
> > > > x86-64 ABI requires that you call CLD here (enclave may set it).
> > > 
> > > Ugh.  Technically MXCSR and the x87 CW also need to be preserved.
> > > 
> > > What if rather than treating the enclave as hostile we require it to be
> > > compliant with the x86-64 ABI like any other function?  That would solve
> > > the EFLAGS.DF, MXCSR and x87 issues without adding unnecessary overhead.
> > > And we wouldn't have to save/restore R12-R15.  It'd mean we couldn't use
> > > the stack's red zone to hold @regs and @e, but that's poor form anyways.
> > 
> > Grr, except the processor crushes R12-R15, FCW and MXCSR on asynchronous
> > exits.  But not EFLAGS.DF, that's real helpful.
> 
> I can think of three options that are at least somewhat reasonable:
> 
>   1) Save/restore MXCSR and FCW
> 
>  + 100% compliant with the x86-64 ABI
>  + Callable from any code
>  + Minimal documentation required
>  - Restoring MXCSR/FCW is likely unnecessary 99% of the time
>  - Slow
> 
>   2) Clear EFLAGS.DF but not save/restore MXCSR and FCW
> 
>  + Mostly compliant with the x86-64 ABI
>  + Callable from any code that doesn't use SIMD registers
>  - Need to document deviations from x86-64 ABI
> 
>   3) Require the caller to save/restore everything.
> 
>  + Fast
>  + Userspace can pass all GPRs to the enclave (minus EAX, RBX and RCX)
>  - Completely custom ABI
>  - For all intents and purposes must be called from an assembly wrapper
> 
> 
> Option (3) actually isn't all that awful.  RCX can be used to pass an
> optional pointer to a 'struct sgx_enclave_exception' and we can still
> return standard error codes, e.g. -EFAULT.

Entering and exiting a syscall requires an assembly wrapper, and that
doesn't seem completely unreasonable. It's an easy bit of inline
assembly.

Re: [RFC PATCH v3 0/4] x86: Add exception fixup for SGX ENCLU

2018-12-10 Thread Josh Triplett

On Mon, Dec 10, 2018 at 03:21:37PM -0800, Sean Christopherson wrote:
> At that point I realized it's a hell of a lot easier to simply provide
> an IOCTL via /dev/sgx that allows userspace to register a per-process
> ENCLU exception handler.  At a high level, the basic idea is the same
> as the vDSO approach: provide a hardcoded fixup handler for ENCLU and
> attempt to fixup select unhandled exceptions that occurred in user code.

So, on the one hand, this is *absolutely* much cleaner than the VDSO
approach. On the other hand, this is global process state and has some
of the same problems as a signal handler as a result.

Re: [PATCH v11 00/13] Intel SGX1 support

2018-12-10 Thread Josh Triplett

On Mon, Dec 10, 2018 at 09:27:04AM +0100, Pavel Machek wrote:
> On Sun 2018-12-09 23:47:17, Josh Triplett wrote:
> > On Sun, Dec 09, 2018 at 09:06:00PM +0100, Pavel Machek wrote:
> > ...
> > > > > > The default permissions for the device are 600.
> > > > > 
> > > > > Good. This does not belong to non-root.
> > > > 
> > > > There are entirely legitimate use cases for using this as an
> > > > unprivileged user. However, that'll be up to system and distribution
> > > > policy, which can evolve over time, and it makes sense for the *initial*
> > > > kernel permission to start out root-only and then adjust permissions via
> > > > udev.
> > > 
> > > Agreed.
> > > 
> > > > Building a software certificate store. Hardening key-agent software like
> > > > ssh-agent or gpg-agent. Building a challenge-response authentication
> > > > system. Providing more assurance that your server infrastructure is
> > > > uncompromised. Offloading computation to a system without having to
> > > > fully trust that system.
> > > 
> > > I think you can do the crypto stuff... as crypto already verifies the
> > > results. But I don't think you can do the computation offload.
> > 
> > You can, as long as you can do attestation.
> 
> You can not, because random errors are very easy to trigger for person
> with physical access,

Random errors can also just happen, so if you're concerned about that
you might want to build each object on two different machines and
compare. Good luck generating the *same* random errors on two machines.

(And, of course, someone can also DoS you in any number of other ways,
such as accepting data and then never sending back a result. So you'll
need timeouts and failovers.)

> > > > As one of many possibilities, imagine a distcc that didn't have to trust
> > > > the compile nodes. The compile nodes could fail to return results at
> > > > all, but they couldn't alter the results.
> > > 
> > > distcc on untrusted nodes ... oh yes, that would be great.
> > > 
> > > Except that you can't do it, right? :-).
> > > 
> > > First, AFAICT it would be quite hard to get gcc to run under SGX. But
> > > maybe you have spare month or three and can do it.
> > 
> > Assuming you don't need to #include files, gcc seems quite simple to run
> > in an enclave: data in, computation inside, data out.
> 
> So is there a plan to run dynamically linked binaries inside enclave?

I've seen some approaches for that, but you could also just statically
link your compiler. (Since you'd need attestation for all the individual
libraries, you'd need to know the versions of all those libraries, so
you might as well just statically link.)

> Or maybe even python/shell scripts? It looked to me like virtual
> memory will be "interesting" for enclaves.

Memory management doesn't seem that hard to deal with.

Re: [PATCH v11 00/13] Intel SGX1 support

2018-12-09 Thread Josh Triplett

On Sun, Dec 09, 2018 at 09:06:00PM +0100, Pavel Machek wrote:
...
> > > > The default permissions for the device are 600.
> > > 
> > > Good. This does not belong to non-root.
> > 
> > There are entirely legitimate use cases for using this as an
> > unprivileged user. However, that'll be up to system and distribution
> > policy, which can evolve over time, and it makes sense for the *initial*
> > kernel permission to start out root-only and then adjust permissions via
> > udev.
> 
> Agreed.
> 
> > Building a software certificate store. Hardening key-agent software like
> > ssh-agent or gpg-agent. Building a challenge-response authentication
> > system. Providing more assurance that your server infrastructure is
> > uncompromised. Offloading computation to a system without having to
> > fully trust that system.
> 
> I think you can do the crypto stuff... as crypto already verifies the
> results. But I don't think you can do the computation offload.

You can, as long as you can do attestation.

> > As one of many possibilities, imagine a distcc that didn't have to trust
> > the compile nodes. The compile nodes could fail to return results at
> > all, but they couldn't alter the results.
> 
> distcc on untrusted nodes ... oh yes, that would be great.
> 
> Except that you can't do it, right? :-).
> 
> First, AFAICT it would be quite hard to get gcc to run under SGX. But
> maybe you have spare month or three and can do it.

Assuming you don't need to #include files, gcc seems quite simple to run
in an enclave: data in, computation inside, data out.

Re: [tip:core/rcu] rcutorture: Make initrd/init execute in userspace

2018-12-05 Thread Josh Triplett

On Thu, Dec 06, 2018 at 01:51:47AM +0100, Andrea Parri wrote:
> > commit 4f8f751961b536f77c8f82394963e8e2d26efd84
> > Author: Paul E. McKenney 
> > Date:   Tue Dec 4 14:59:12 2018 -0800
> > 
> > torture: Explain and simplify odd "for" loop in mkinitrd.sh
> > 
> > Why a Bourne-shell "for" loop?  And why 192 instances of "a"?  This 
> > commit
> > adds a shell comment to present the answer to these mysteries.  It also
> > uses a series of factor-of-four Bourne-shell assignments to make it
> > easy to see how many instances there are, replacing the earlier wall of
> > 'a' characters.
> > 
> > Reported-by: Josh Triplett 
> > Signed-off-by: Paul E. McKenney 
> > 
> > diff --git a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh 
> > b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > index da298394daa2..ff69190604ea 100755
> > --- a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > +++ b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > @@ -40,17 +40,24 @@ mkdir $T
> >  cat > $T/init << '__EOF___'
> >  #!/bin/sh
> >  # Run in userspace a few milliseconds every second.  This helps to
> > -# exercise the NO_HZ_FULL portions of RCU.
> > +# exercise the NO_HZ_FULL portions of RCU.  The 192 instances of "a" was
> > +# empirically shown to give a nice multi-millisecond burst of user-mode
> > +# execution on a 2GHz CPU, as desired.  Modern CPUs will vary from a
> > +# couple of milliseconds up to perhaps 100 milliseconds, which is an
> > +# acceptable range.
> > +#
> > +# Why not calibrate an exact delay?  Because within this initrd, we
> > +# are restricted to Bourne-shell builtins, which as far as I know do not
> > +# provide any means of obtaining a fine-grained timestamp.
> > +
> > +a4="a a a a"
> > +a16="$a4 $a4 $a4 $a4"
> > +a64="$a8 $a8 $a8 $a8"
> 
> Mmh, are you sure you don't want s/a8/a16/ here? ;-)

... *facepalm*

Good catch.

Re: [tip:core/rcu] rcutorture: Make initrd/init execute in userspace

2018-12-05 Thread Josh Triplett

On Thu, Dec 06, 2018 at 01:51:47AM +0100, Andrea Parri wrote:
> > commit 4f8f751961b536f77c8f82394963e8e2d26efd84
> > Author: Paul E. McKenney 
> > Date:   Tue Dec 4 14:59:12 2018 -0800
> > 
> > torture: Explain and simplify odd "for" loop in mkinitrd.sh
> > 
> > Why a Bourne-shell "for" loop?  And why 192 instances of "a"?  This 
> > commit
> > adds a shell comment to present the answer to these mysteries.  It also
> > uses a series of factor-of-four Bourne-shell assignments to make it
> > easy to see how many instances there are, replacing the earlier wall of
> > 'a' characters.
> > 
> > Reported-by: Josh Triplett 
> > Signed-off-by: Paul E. McKenney 
> > 
> > diff --git a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh 
> > b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > index da298394daa2..ff69190604ea 100755
> > --- a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > +++ b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > @@ -40,17 +40,24 @@ mkdir $T
> >  cat > $T/init << '__EOF___'
> >  #!/bin/sh
> >  # Run in userspace a few milliseconds every second.  This helps to
> > -# exercise the NO_HZ_FULL portions of RCU.
> > +# exercise the NO_HZ_FULL portions of RCU.  The 192 instances of "a" was
> > +# empirically shown to give a nice multi-millisecond burst of user-mode
> > +# execution on a 2GHz CPU, as desired.  Modern CPUs will vary from a
> > +# couple of milliseconds up to perhaps 100 milliseconds, which is an
> > +# acceptable range.
> > +#
> > +# Why not calibrate an exact delay?  Because within this initrd, we
> > +# are restricted to Bourne-shell builtins, which as far as I know do not
> > +# provide any means of obtaining a fine-grained timestamp.
> > +
> > +a4="a a a a"
> > +a16="$a4 $a4 $a4 $a4"
> > +a64="$a8 $a8 $a8 $a8"
> 
> Mmh, are you sure you don't want s/a8/a16/ here? ;-)

... *facepalm*

Good catch.

Re: [tip:core/rcu] rcutorture: Make initrd/init execute in userspace

2018-12-05 Thread Josh Triplett

On Wed, Dec 05, 2018 at 04:08:09PM -0800, Paul E. McKenney wrote:
> On Wed, Dec 05, 2018 at 02:25:24PM -0800, Josh Triplett wrote:
> > On Tue, Dec 04, 2018 at 03:04:23PM -0800, Paul E. McKenney wrote:
> > > On Tue, Dec 04, 2018 at 02:24:13PM -0800, Josh Triplett wrote:
> > > > On Tue, Dec 04, 2018 at 02:09:42PM -0800, tip-bot for Paul E. McKenney 
> > > > wrote:
> > > > > --- a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > > > > +++ b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > > > > @@ -39,9 +39,22 @@ mkdir $T
> > > > >  
> > > > >  cat > $T/init << '__EOF___'
> > > > >  #!/bin/sh
> > > > > +# Run in userspace a few milliseconds every second.  This helps to
> > > > > +# exercise the NO_HZ_FULL portions of RCU.
> > > > >  while :
> > > > >  do
> > > > > - sleep 100
> > > > > + q=
> > > > > + for i in \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a
> > > > 
> > > > Ow. If there's no better way to do this, please do at least comment how 
> > > > many 'a's
> > > > this is. (And why 186, exactly?)
> > > 
> > > Yeah, that is admittedly a bit strange.  The reason for 186 occurrences of
> > > "a" to one-time calibration, measuring a few millisecond's worth of delay.
> > > 
> > > > Please also consider calibrating the delay loop as you do in the C code.
> > > 
> > > Good point.  And a quick web search finds me "date '+%s%N'", which gives
> > > me nanoseconds since the epoch.  I probably don't want to do a 2038 to
> > > myself (after all, I might still be alive then), so I should probably try
> > > to make something work with "date '+%N'".  Or use something like this:
> > > 
> > >   $ date '+%4N'; date '+%4N';date '+%4N'; date '+%4N'
> > >   6660
> > >   6685
> > >   6697
> > >   6710
> > > 
> > > Ah, but that means I need to add the "date" command to my initrd, doesn't
> > > it?  And calculation requires either bash or the "test" command.  And it
> > > would be quite good to restrict this to what can be done with Bourne shell
> > > built-in commands, since a big point of this is to maintain a small-sized
> > > initrd.  :-/
> > 
> > Sure, and I'm not suggesting adding commands to the initrd, hence my
> > mention of "If there's no better way".
> > 
> > > So how about the following patch, which attempts to explain the situation?
> > 
> > That would help, but please also consider consolidating with something
> > like a10="a a a a a a a a a a" to make it more readable (and perhaps
> > rounding up to 200 for simplicity).
> 
> How about powers of four and one factor of three for 192, as shown below?

Perfect, thanks. That's much better.

Reviewed-by: Josh Triplett 

>   Thanx, Paul
> 
> 
> 
> commit 4f8f751961b536f77c8f82394963e8e2d26efd84
> Author: Paul E. McKenney 
> Date:   Tue Dec 4 14:59:12 2018 -0800
> 
> torture: Explain and simplify odd "for" loop in mkinitrd.sh
> 
> Why a Bourne-shell "for" loop?  And why 192 instances of "a"?  This commit
> adds a shell comment to present the answer to these mysteries.  It also
> uses a series of factor-of-four Bourne-shell assignments to make it
> easy to see how many instances there are, replacing the earlier wall of
> 'a' characters.
> 
> Reported-by: Josh Triplett 
> Signed-off-by: Paul E. McKenney 
> 
> diff --git a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh 
> b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> index da298394daa2..ff691906

Re: [tip:core/rcu] rcutorture: Make initrd/init execute in userspace

2018-12-05 Thread Josh Triplett

On Wed, Dec 05, 2018 at 04:08:09PM -0800, Paul E. McKenney wrote:
> On Wed, Dec 05, 2018 at 02:25:24PM -0800, Josh Triplett wrote:
> > On Tue, Dec 04, 2018 at 03:04:23PM -0800, Paul E. McKenney wrote:
> > > On Tue, Dec 04, 2018 at 02:24:13PM -0800, Josh Triplett wrote:
> > > > On Tue, Dec 04, 2018 at 02:09:42PM -0800, tip-bot for Paul E. McKenney 
> > > > wrote:
> > > > > --- a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > > > > +++ b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > > > > @@ -39,9 +39,22 @@ mkdir $T
> > > > >  
> > > > >  cat > $T/init << '__EOF___'
> > > > >  #!/bin/sh
> > > > > +# Run in userspace a few milliseconds every second.  This helps to
> > > > > +# exercise the NO_HZ_FULL portions of RCU.
> > > > >  while :
> > > > >  do
> > > > > - sleep 100
> > > > > + q=
> > > > > + for i in \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a \
> > > > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a 
> > > > > a a a
> > > > 
> > > > Ow. If there's no better way to do this, please do at least comment how 
> > > > many 'a's
> > > > this is. (And why 186, exactly?)
> > > 
> > > Yeah, that is admittedly a bit strange.  The reason for 186 occurrences of
> > > "a" to one-time calibration, measuring a few millisecond's worth of delay.
> > > 
> > > > Please also consider calibrating the delay loop as you do in the C code.
> > > 
> > > Good point.  And a quick web search finds me "date '+%s%N'", which gives
> > > me nanoseconds since the epoch.  I probably don't want to do a 2038 to
> > > myself (after all, I might still be alive then), so I should probably try
> > > to make something work with "date '+%N'".  Or use something like this:
> > > 
> > >   $ date '+%4N'; date '+%4N';date '+%4N'; date '+%4N'
> > >   6660
> > >   6685
> > >   6697
> > >   6710
> > > 
> > > Ah, but that means I need to add the "date" command to my initrd, doesn't
> > > it?  And calculation requires either bash or the "test" command.  And it
> > > would be quite good to restrict this to what can be done with Bourne shell
> > > built-in commands, since a big point of this is to maintain a small-sized
> > > initrd.  :-/
> > 
> > Sure, and I'm not suggesting adding commands to the initrd, hence my
> > mention of "If there's no better way".
> > 
> > > So how about the following patch, which attempts to explain the situation?
> > 
> > That would help, but please also consider consolidating with something
> > like a10="a a a a a a a a a a" to make it more readable (and perhaps
> > rounding up to 200 for simplicity).
> 
> How about powers of four and one factor of three for 192, as shown below?

Perfect, thanks. That's much better.

Reviewed-by: Josh Triplett 

>   Thanx, Paul
> 
> 
> 
> commit 4f8f751961b536f77c8f82394963e8e2d26efd84
> Author: Paul E. McKenney 
> Date:   Tue Dec 4 14:59:12 2018 -0800
> 
> torture: Explain and simplify odd "for" loop in mkinitrd.sh
> 
> Why a Bourne-shell "for" loop?  And why 192 instances of "a"?  This commit
> adds a shell comment to present the answer to these mysteries.  It also
> uses a series of factor-of-four Bourne-shell assignments to make it
> easy to see how many instances there are, replacing the earlier wall of
> 'a' characters.
> 
> Reported-by: Josh Triplett 
> Signed-off-by: Paul E. McKenney 
> 
> diff --git a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh 
> b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> index da298394daa2..ff691906

Re: [tip:core/rcu] rcutorture: Make initrd/init execute in userspace

2018-12-05 Thread Josh Triplett

On Tue, Dec 04, 2018 at 03:04:23PM -0800, Paul E. McKenney wrote:
> On Tue, Dec 04, 2018 at 02:24:13PM -0800, Josh Triplett wrote:
> > On Tue, Dec 04, 2018 at 02:09:42PM -0800, tip-bot for Paul E. McKenney 
> > wrote:
> > > --- a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > > +++ b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > > @@ -39,9 +39,22 @@ mkdir $T
> > >  
> > >  cat > $T/init << '__EOF___'
> > >  #!/bin/sh
> > > +# Run in userspace a few milliseconds every second.  This helps to
> > > +# exercise the NO_HZ_FULL portions of RCU.
> > >  while :
> > >  do
> > > - sleep 100
> > > + q=
> > > + for i in \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
> > 
> > Ow. If there's no better way to do this, please do at least comment how 
> > many 'a's
> > this is. (And why 186, exactly?)
> 
> Yeah, that is admittedly a bit strange.  The reason for 186 occurrences of
> "a" to one-time calibration, measuring a few millisecond's worth of delay.
> 
> > Please also consider calibrating the delay loop as you do in the C code.
> 
> Good point.  And a quick web search finds me "date '+%s%N'", which gives
> me nanoseconds since the epoch.  I probably don't want to do a 2038 to
> myself (after all, I might still be alive then), so I should probably try
> to make something work with "date '+%N'".  Or use something like this:
> 
>   $ date '+%4N'; date '+%4N';date '+%4N'; date '+%4N'
>   6660
>   6685
>   6697
>   6710
> 
> Ah, but that means I need to add the "date" command to my initrd, doesn't
> it?  And calculation requires either bash or the "test" command.  And it
> would be quite good to restrict this to what can be done with Bourne shell
> built-in commands, since a big point of this is to maintain a small-sized
> initrd.  :-/

Sure, and I'm not suggesting adding commands to the initrd, hence my
mention of "If there's no better way".

> So how about the following patch, which attempts to explain the situation?

That would help, but please also consider consolidating with something
like a10="a a a a a a a a a a" to make it more readable (and perhaps
rounding up to 200 for simplicity).

- Josh

Re: [tip:core/rcu] rcutorture: Make initrd/init execute in userspace

2018-12-05 Thread Josh Triplett

On Tue, Dec 04, 2018 at 03:04:23PM -0800, Paul E. McKenney wrote:
> On Tue, Dec 04, 2018 at 02:24:13PM -0800, Josh Triplett wrote:
> > On Tue, Dec 04, 2018 at 02:09:42PM -0800, tip-bot for Paul E. McKenney 
> > wrote:
> > > --- a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > > +++ b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> > > @@ -39,9 +39,22 @@ mkdir $T
> > >  
> > >  cat > $T/init << '__EOF___'
> > >  #!/bin/sh
> > > +# Run in userspace a few milliseconds every second.  This helps to
> > > +# exercise the NO_HZ_FULL portions of RCU.
> > >  while :
> > >  do
> > > - sleep 100
> > > + q=
> > > + for i in \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> > > + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
> > 
> > Ow. If there's no better way to do this, please do at least comment how 
> > many 'a's
> > this is. (And why 186, exactly?)
> 
> Yeah, that is admittedly a bit strange.  The reason for 186 occurrences of
> "a" to one-time calibration, measuring a few millisecond's worth of delay.
> 
> > Please also consider calibrating the delay loop as you do in the C code.
> 
> Good point.  And a quick web search finds me "date '+%s%N'", which gives
> me nanoseconds since the epoch.  I probably don't want to do a 2038 to
> myself (after all, I might still be alive then), so I should probably try
> to make something work with "date '+%N'".  Or use something like this:
> 
>   $ date '+%4N'; date '+%4N';date '+%4N'; date '+%4N'
>   6660
>   6685
>   6697
>   6710
> 
> Ah, but that means I need to add the "date" command to my initrd, doesn't
> it?  And calculation requires either bash or the "test" command.  And it
> would be quite good to restrict this to what can be done with Bourne shell
> built-in commands, since a big point of this is to maintain a small-sized
> initrd.  :-/

Sure, and I'm not suggesting adding commands to the initrd, hence my
mention of "If there's no better way".

> So how about the following patch, which attempts to explain the situation?

That would help, but please also consider consolidating with something
like a10="a a a a a a a a a a" to make it more readable (and perhaps
rounding up to 200 for simplicity).

- Josh

Re: [tip:core/rcu] rcutorture: Make initrd/init execute in userspace

2018-12-04 Thread Josh Triplett

On Tue, Dec 04, 2018 at 02:09:42PM -0800, tip-bot for Paul E. McKenney wrote:
> --- a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> +++ b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> @@ -39,9 +39,22 @@ mkdir $T
>  
>  cat > $T/init << '__EOF___'
>  #!/bin/sh
> +# Run in userspace a few milliseconds every second.  This helps to
> +# exercise the NO_HZ_FULL portions of RCU.
>  while :
>  do
> - sleep 100
> + q=
> + for i in \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a

Ow. If there's no better way to do this, please do at least comment how many 
'a's
this is. (And why 186, exactly?)

Please also consider calibrating the delay loop as you do in the C code.

Re: [tip:core/rcu] rcutorture: Make initrd/init execute in userspace

2018-12-04 Thread Josh Triplett

On Tue, Dec 04, 2018 at 02:09:42PM -0800, tip-bot for Paul E. McKenney wrote:
> --- a/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> +++ b/tools/testing/selftests/rcutorture/bin/mkinitrd.sh
> @@ -39,9 +39,22 @@ mkdir $T
>  
>  cat > $T/init << '__EOF___'
>  #!/bin/sh
> +# Run in userspace a few milliseconds every second.  This helps to
> +# exercise the NO_HZ_FULL portions of RCU.
>  while :
>  do
> - sleep 100
> + q=
> + for i in \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a \
> + a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a

Ow. If there's no better way to do this, please do at least comment how many 
'a's
this is. (And why 186, exactly?)

Please also consider calibrating the delay loop as you do in the C code.

Re: [Ksummit-discuss] Call to Action Re: [PATCH 0/7] Code of Conduct: Fix some wording, and add an interpretation document

2018-11-01 Thread Josh Triplett

On Thu, Nov 01, 2018 at 09:45:44AM -0700, Paul E. McKenney wrote:
> On Sat, Oct 27, 2018 at 02:10:10AM +0100, Josh Triplett wrote:
> > Not when that document started out effectively saying, in an elaborate
> > way, "code > people".
> 
> Interesting.
> 
> I am curious what leads you to your "code > people" statement.  Of course,
> one could argue that this does not really matter given that the code of
> conflict is no longer.  However, I would like to understand for future
> reference, if for no other reason.
> 
> One possibility is that you are restricting the "people" to only those
> people directly contributing in one way or another.  But those using the
> kernel (both directly and indirectly) are important as well, and it is
> exactly this group that is served by "the most robust operating system
> kernel ever", the chest-beating sentiment notwithstanding.  Which is in
> fact why I must reject (or rework or whatever) any patch that might result
> in too-short RCU grace periods:  The needs of the patch's submitter are
> quite emphatically outweighed by the needs of the kernel's many users,
> and many of the various technical requirements and restrictions are in
> fact proxies for the needs of these users.

As discussed in many other places as well, nobody is suggesting at all
that the standards for accepting code should change. Reject the patches
you would have rejected, accept the patches you would have accepted. All
of this affects *communication*.

- Josh Triplett

Re: [Ksummit-discuss] Call to Action Re: [PATCH 0/7] Code of Conduct: Fix some wording, and add an interpretation document

2018-11-01 Thread Josh Triplett

On Thu, Nov 01, 2018 at 09:45:44AM -0700, Paul E. McKenney wrote:
> On Sat, Oct 27, 2018 at 02:10:10AM +0100, Josh Triplett wrote:
> > Not when that document started out effectively saying, in an elaborate
> > way, "code > people".
> 
> Interesting.
> 
> I am curious what leads you to your "code > people" statement.  Of course,
> one could argue that this does not really matter given that the code of
> conflict is no longer.  However, I would like to understand for future
> reference, if for no other reason.
> 
> One possibility is that you are restricting the "people" to only those
> people directly contributing in one way or another.  But those using the
> kernel (both directly and indirectly) are important as well, and it is
> exactly this group that is served by "the most robust operating system
> kernel ever", the chest-beating sentiment notwithstanding.  Which is in
> fact why I must reject (or rework or whatever) any patch that might result
> in too-short RCU grace periods:  The needs of the patch's submitter are
> quite emphatically outweighed by the needs of the kernel's many users,
> and many of the various technical requirements and restrictions are in
> fact proxies for the needs of these users.

As discussed in many other places as well, nobody is suggesting at all
that the standards for accepting code should change. Reject the patches
you would have rejected, accept the patches you would have accepted. All
of this affects *communication*.

- Josh Triplett

Re: [Ksummit-discuss] Call to Action Re: [PATCH 0/7] Code of Conduct: Fix some wording, and add an interpretation document

2018-10-26 Thread Josh Triplett

On Fri, Oct 26, 2018 at 08:14:51AM +1100, NeilBrown wrote:
> On Wed, Oct 24 2018, Josh Triplett wrote:
> 
> > On Tue, Oct 23, 2018 at 07:26:06AM +1100, NeilBrown wrote:
> >> On Sun, Oct 21 2018, Josh Triplett wrote:
> >> 
> >> > On Mon, Oct 22, 2018 at 08:20:11AM +1100, NeilBrown wrote:
> >> >> I call on you, Greg:
> >> >>  - to abandon this divisive attempt to impose a "Code of Conduct"
> >> >>  - to revert 8a104f8b5867c68
> >> >>  - to return to your core competence of building a great team around
> >> >>a great kernel
> >> >> 
> >> >>  #Isupportreversion
> >> >> 
> >> >> I call on the community to consider what *does* need to be said, about
> >> >> conduct, to people outside the community and who have recently joined.
> >> >> What is the document that you would have liked to have read as you were
> >> >> starting out?  It is all too long ago for me to remember clearly, and so
> >> >> much has changed.
> >> >
> >> > The document I would have liked to have read when starting out is
> >> > currently checked into the source tree in
> >> > Documentation/process/code-of-conduct.rst .
> >> 
> >> I'm curious - what would you have gained by reading that document?
> >
> > I would have then had rather less of a pervasive feeling of "if I make
> > even a single mistake I get made an example of in ways that will feed
> > people's quotes files for years to come".
> 
> Thanks for your reply.  Certainly feeling safe is important, and having
> clear statements that the community values and promotes psychological
> safety is valuable.
> 
> The old "code of conflict" said
>If however, anyone feels personally abused, threatened, or otherwise
>uncomfortable due to this process, that is not acceptable. 
> 
> would you have not found this a strong enough statement to ward off that
> pervasive feeling?

Not when that document started out effectively saying, in an elaborate
way, "code > people". (Leaving aside that the more important detail
would be the community actually acting consistently with the code of
conduct it espoused.)

> In the current code, would The "Our Pledge" section have been
> sufficient, or do you think the other sections would have actually
> helped you?

"Our Standards" would have been at least as important to me personally,
as would "Enforcement" (and more importantly, examples of that applying
in practice and not just as empty words).

Re: [Ksummit-discuss] Call to Action Re: [PATCH 0/7] Code of Conduct: Fix some wording, and add an interpretation document

2018-10-26 Thread Josh Triplett

On Fri, Oct 26, 2018 at 08:14:51AM +1100, NeilBrown wrote:
> On Wed, Oct 24 2018, Josh Triplett wrote:
> 
> > On Tue, Oct 23, 2018 at 07:26:06AM +1100, NeilBrown wrote:
> >> On Sun, Oct 21 2018, Josh Triplett wrote:
> >> 
> >> > On Mon, Oct 22, 2018 at 08:20:11AM +1100, NeilBrown wrote:
> >> >> I call on you, Greg:
> >> >>  - to abandon this divisive attempt to impose a "Code of Conduct"
> >> >>  - to revert 8a104f8b5867c68
> >> >>  - to return to your core competence of building a great team around
> >> >>a great kernel
> >> >> 
> >> >>  #Isupportreversion
> >> >> 
> >> >> I call on the community to consider what *does* need to be said, about
> >> >> conduct, to people outside the community and who have recently joined.
> >> >> What is the document that you would have liked to have read as you were
> >> >> starting out?  It is all too long ago for me to remember clearly, and so
> >> >> much has changed.
> >> >
> >> > The document I would have liked to have read when starting out is
> >> > currently checked into the source tree in
> >> > Documentation/process/code-of-conduct.rst .
> >> 
> >> I'm curious - what would you have gained by reading that document?
> >
> > I would have then had rather less of a pervasive feeling of "if I make
> > even a single mistake I get made an example of in ways that will feed
> > people's quotes files for years to come".
> 
> Thanks for your reply.  Certainly feeling safe is important, and having
> clear statements that the community values and promotes psychological
> safety is valuable.
> 
> The old "code of conflict" said
>If however, anyone feels personally abused, threatened, or otherwise
>uncomfortable due to this process, that is not acceptable. 
> 
> would you have not found this a strong enough statement to ward off that
> pervasive feeling?

Not when that document started out effectively saying, in an elaborate
way, "code > people". (Leaving aside that the more important detail
would be the community actually acting consistently with the code of
conduct it espoused.)

> In the current code, would The "Our Pledge" section have been
> sufficient, or do you think the other sections would have actually
> helped you?

"Our Standards" would have been at least as important to me personally,
as would "Enforcement" (and more importantly, examples of that applying
in practice and not just as empty words).

Re: [Ksummit-discuss] Call to Action Re: [PATCH 0/7] Code of Conduct: Fix some wording, and add an interpretation document

2018-10-24 Thread Josh Triplett

On Tue, Oct 23, 2018 at 07:26:06AM +1100, NeilBrown wrote:
> On Sun, Oct 21 2018, Josh Triplett wrote:
> 
> > On Mon, Oct 22, 2018 at 08:20:11AM +1100, NeilBrown wrote:
> >> I call on you, Greg:
> >>  - to abandon this divisive attempt to impose a "Code of Conduct"
> >>  - to revert 8a104f8b5867c68
> >>  - to return to your core competence of building a great team around
> >>a great kernel
> >> 
> >>  #Isupportreversion
> >> 
> >> I call on the community to consider what *does* need to be said, about
> >> conduct, to people outside the community and who have recently joined.
> >> What is the document that you would have liked to have read as you were
> >> starting out?  It is all too long ago for me to remember clearly, and so
> >> much has changed.
> >
> > The document I would have liked to have read when starting out is
> > currently checked into the source tree in
> > Documentation/process/code-of-conduct.rst .
> 
> I'm curious - what would you have gained by reading that document?

I would have then had rather less of a pervasive feeling of "if I make
even a single mistake I get made an example of in ways that will feed
people's quotes files for years to come".

See
https://hbr.org/2017/08/high-performing-teams-need-psychological-safety-heres-how-to-create-it
for more on the benefits of that.

Re: [Ksummit-discuss] Call to Action Re: [PATCH 0/7] Code of Conduct: Fix some wording, and add an interpretation document

2018-10-24 Thread Josh Triplett

On Tue, Oct 23, 2018 at 07:26:06AM +1100, NeilBrown wrote:
> On Sun, Oct 21 2018, Josh Triplett wrote:
> 
> > On Mon, Oct 22, 2018 at 08:20:11AM +1100, NeilBrown wrote:
> >> I call on you, Greg:
> >>  - to abandon this divisive attempt to impose a "Code of Conduct"
> >>  - to revert 8a104f8b5867c68
> >>  - to return to your core competence of building a great team around
> >>a great kernel
> >> 
> >>  #Isupportreversion
> >> 
> >> I call on the community to consider what *does* need to be said, about
> >> conduct, to people outside the community and who have recently joined.
> >> What is the document that you would have liked to have read as you were
> >> starting out?  It is all too long ago for me to remember clearly, and so
> >> much has changed.
> >
> > The document I would have liked to have read when starting out is
> > currently checked into the source tree in
> > Documentation/process/code-of-conduct.rst .
> 
> I'm curious - what would you have gained by reading that document?

I would have then had rather less of a pervasive feeling of "if I make
even a single mistake I get made an example of in ways that will feed
people's quotes files for years to come".

See
https://hbr.org/2017/08/high-performing-teams-need-psychological-safety-heres-how-to-create-it
for more on the benefits of that.

Re: [Ksummit-discuss] [GIT PULL] code of conduct fixes for 4.19-rc8

2018-10-23 Thread Josh Triplett

On Mon, Oct 22, 2018 at 09:16:20PM -0700, Joe Perches wrote:
> On Mon, 2018-10-22 at 22:10 +0100, Greg Kroah-Hartman wrote:
> > On Sat, Oct 20, 2018 at 01:15:14PM -0700, James Bottomley wrote:
> > > This is the series of patches which has been discussed on both ksummit-
> > > discuss and linux-kernel for the past few weeks.  As Shuah said when
> > > kicking off the process, it's designed as a starting point for the next
> > > phase of the discussion, not as the end point, so it's only really a
> > > set of minor updates to further that goal.
> > > 
> > > The merger of the three patches to show the combined effect is attached
> > > below.  However, Greg recently posted the next phase of the discussion,
> > > so people will be asking what the merger of the series looks like. 
> > > Ignoring the non-CoC documents, I think it looks like this
> > 
> > Sorry for not responding sooner for this, travel and the meeting today
> > took up my time.
> > 
> > Anyway, as we discussed today in the Maintainers summit, let's leave the
> > Code of Conduct text alone for now.  It matches what "upstream" has with
> > the exception of removing that one paragraph.  If you have issues with
> > the wording in it, please work with upstream to fix the issues there as
> > hundreds of other projects will benefit with your changes if they are
> > really needed.
> 
> Given the different development models, that's not
> a very compelling argument.
> 
> As James Bottomley has suggested multiple times,
> I'd much rather kernel development use the debian
> code of conduct verbatim than even this modified one.
> 
> https://www.debian.org/code_of_conduct

The Debian code of conduct doesn't do nearly as good a job of addressing
issues. (Debian also adopted that code of conduct back when such codes
weren't nearly as well understood or established.) Many people *in*
Debian, including supporters of their current CoC, have an interest in
improving it further and/or adopting a more well-established one.

Re: [Ksummit-discuss] [GIT PULL] code of conduct fixes for 4.19-rc8

2018-10-23 Thread Josh Triplett

On Mon, Oct 22, 2018 at 09:16:20PM -0700, Joe Perches wrote:
> On Mon, 2018-10-22 at 22:10 +0100, Greg Kroah-Hartman wrote:
> > On Sat, Oct 20, 2018 at 01:15:14PM -0700, James Bottomley wrote:
> > > This is the series of patches which has been discussed on both ksummit-
> > > discuss and linux-kernel for the past few weeks.  As Shuah said when
> > > kicking off the process, it's designed as a starting point for the next
> > > phase of the discussion, not as the end point, so it's only really a
> > > set of minor updates to further that goal.
> > > 
> > > The merger of the three patches to show the combined effect is attached
> > > below.  However, Greg recently posted the next phase of the discussion,
> > > so people will be asking what the merger of the series looks like. 
> > > Ignoring the non-CoC documents, I think it looks like this
> > 
> > Sorry for not responding sooner for this, travel and the meeting today
> > took up my time.
> > 
> > Anyway, as we discussed today in the Maintainers summit, let's leave the
> > Code of Conduct text alone for now.  It matches what "upstream" has with
> > the exception of removing that one paragraph.  If you have issues with
> > the wording in it, please work with upstream to fix the issues there as
> > hundreds of other projects will benefit with your changes if they are
> > really needed.
> 
> Given the different development models, that's not
> a very compelling argument.
> 
> As James Bottomley has suggested multiple times,
> I'd much rather kernel development use the debian
> code of conduct verbatim than even this modified one.
> 
> https://www.debian.org/code_of_conduct

The Debian code of conduct doesn't do nearly as good a job of addressing
issues. (Debian also adopted that code of conduct back when such codes
weren't nearly as well understood or established.) Many people *in*
Debian, including supporters of their current CoC, have an interest in
improving it further and/or adopting a more well-established one.

Re: [Ksummit-discuss] Call to Action Re: [PATCH 0/7] Code of Conduct: Fix some wording, and add an interpretation document

2018-10-21 Thread Josh Triplett

On Mon, Oct 22, 2018 at 08:20:11AM +1100, NeilBrown wrote:
> I call on you, Greg:
>  - to abandon this divisive attempt to impose a "Code of Conduct"
>  - to revert 8a104f8b5867c68
>  - to return to your core competence of building a great team around
>a great kernel
> 
>  #Isupportreversion
> 
> I call on the community to consider what *does* need to be said, about
> conduct, to people outside the community and who have recently joined.
> What is the document that you would have liked to have read as you were
> starting out?  It is all too long ago for me to remember clearly, and so
> much has changed.

The document I would have liked to have read when starting out is
currently checked into the source tree in
Documentation/process/code-of-conduct.rst .

What is your actual concern? Why does it matter so much to you to push
back against this code of conduct? What does it say that you so
fundamentally object to?

(I personally *do* want to see most of the patch series that started
this particular thread dropped, because half of it undermines the point
of the document. The original commit, however, is a matter of
celebration.)

Re: [Ksummit-discuss] Call to Action Re: [PATCH 0/7] Code of Conduct: Fix some wording, and add an interpretation document

2018-10-21 Thread Josh Triplett

On Mon, Oct 22, 2018 at 08:20:11AM +1100, NeilBrown wrote:
> I call on you, Greg:
>  - to abandon this divisive attempt to impose a "Code of Conduct"
>  - to revert 8a104f8b5867c68
>  - to return to your core competence of building a great team around
>a great kernel
> 
>  #Isupportreversion
> 
> I call on the community to consider what *does* need to be said, about
> conduct, to people outside the community and who have recently joined.
> What is the document that you would have liked to have read as you were
> starting out?  It is all too long ago for me to remember clearly, and so
> much has changed.

The document I would have liked to have read when starting out is
currently checked into the source tree in
Documentation/process/code-of-conduct.rst .

What is your actual concern? Why does it matter so much to you to push
back against this code of conduct? What does it say that you so
fundamentally object to?

(I personally *do* want to see most of the patch series that started
this particular thread dropped, because half of it undermines the point
of the document. The original commit, however, is a matter of
celebration.)

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-17 Thread Josh Triplett

On Wed, Oct 17, 2018 at 08:49:15AM -0700, James Bottomley wrote:
> On Wed, 2018-10-17 at 08:21 -0700, Josh Triplett wrote:
> > People in underrepresented and commonly marginalized groups,
> > especially those more commonly overlooked, don't always know if a
> > given group has taken their particular group into account or given
> > any thought to it. Explicit inclusion helps, and this is a standard
> > guideline often cited for good codes of conduct.
> 
> Actually, that's not a good thing to do in a vacuum: you have to be
> really careful about how you do this from a legal point of view.  The
> argument over whether enumerating specific rights or classes disparages
> others has been going on for centuries.  To give you an example of how
> far back it goes: it's the reason for the ninth amendment to the US
> constitution.
> 
> The commonly accepted legal way of doing this today is the statement
> 
> "examples of X include but are not limited to: ..."
> 
> which is thought to work in most jurisdictions and is what you'll find
> in all US corporate codes of conduct or ethics.

Which is a much better proposal than removing the list entirely.

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-17 Thread Josh Triplett

On Wed, Oct 17, 2018 at 08:49:15AM -0700, James Bottomley wrote:
> On Wed, 2018-10-17 at 08:21 -0700, Josh Triplett wrote:
> > People in underrepresented and commonly marginalized groups,
> > especially those more commonly overlooked, don't always know if a
> > given group has taken their particular group into account or given
> > any thought to it. Explicit inclusion helps, and this is a standard
> > guideline often cited for good codes of conduct.
> 
> Actually, that's not a good thing to do in a vacuum: you have to be
> really careful about how you do this from a legal point of view.  The
> argument over whether enumerating specific rights or classes disparages
> others has been going on for centuries.  To give you an example of how
> far back it goes: it's the reason for the ninth amendment to the US
> constitution.
> 
> The commonly accepted legal way of doing this today is the statement
> 
> "examples of X include but are not limited to: ..."
> 
> which is thought to work in most jurisdictions and is what you'll find
> in all US corporate codes of conduct or ethics.

Which is a much better proposal than removing the list entirely.

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-17 Thread Josh Triplett

On Wed, Oct 17, 2018 at 06:32:36AM -0700, Guenter Roeck wrote:
> One could consider adding something like "discrimination factors such as",
> or maybe "or any other discrimination factors not listed here" to the
> original text. Or a simple "regardless of, for example, ...".

These sound like perfectly reasonable ways to address this. Please
consider submitting a patch upstream based on this reasoning.

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-17 Thread Josh Triplett

On Wed, Oct 17, 2018 at 06:32:36AM -0700, Guenter Roeck wrote:
> One could consider adding something like "discrimination factors such as",
> or maybe "or any other discrimination factors not listed here" to the
> original text. Or a simple "regardless of, for example, ...".

These sound like perfectly reasonable ways to address this. Please
consider submitting a patch upstream based on this reasoning.

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-17 Thread Josh Triplett

On Wed, Oct 17, 2018 at 11:31:35AM +0200, Geert Uytterhoeven wrote:
> Hi Josh,
> 
> Thanks for your comments!
> 
> On Wed, Oct 17, 2018 at 11:13 AM Josh Triplett  wrote:
> > On Wed, Oct 17, 2018 at 09:19:01AM +0200, Geert Uytterhoeven wrote:
> > > Providing an explicit list of discrimination factors may give the false
> > > impression that discrimination based on other unlisted factors would be
> > > allowed.
> >
> > This impression is, in fact, false, as has already been discussed
> > elsewhere. I had hoped that discussion would suffice.
> 
> The CoC FAQ is not part of the CoC, and not part of the Linux kernel.

I wasn't referring just to that; I'm referring to the discussion we've
already had on this exact point.

> > refers to. Listing explicit cases to cover does not imply other cases
> > are not covered;
> 
> It does, if not accompanied by "examples of...", like in the other sections.

"for everyone, regardless of ..." still says "for everyone", making the
"regardless of ..." inherently a non-exhaustive list of factors.

> > it does, however, ensure that the listed cases *are*,
> > and helps people know that they're covered.
> 
> So you agree people cannot know if the unlisted cases are covered or not?

People in underrepresented and commonly marginalized groups, especially
those more commonly overlooked, don't always know if a given group has
taken their particular group into account or given any thought to it.
Explicit inclusion helps, and this is a standard guideline often cited
for good codes of conduct.

That doesn't make other groups *not* covered. But *if* there is a
particular commonly marginalized group that you feel this should
*explicitly* cover and doesn't, I'd suggest *adding* that group rather
than deleting the existing effort to be explicitly inclusive. (And
again, I'd suggest doing so upstream first.)

> > This patch is not OK, and defeats one of the purposes of the original
> > change.
> 
> So the purpose of the original change was to list a number of factors,
> without saying that it was just a list of examples?

You seem to be actively trying to read something more into what I said.
One of the key purposes of the original change was to make the kernel a
"a welcoming environment to participate in", and to provide "explicit
guidelines".

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-17 Thread Josh Triplett

On Wed, Oct 17, 2018 at 11:31:35AM +0200, Geert Uytterhoeven wrote:
> Hi Josh,
> 
> Thanks for your comments!
> 
> On Wed, Oct 17, 2018 at 11:13 AM Josh Triplett  wrote:
> > On Wed, Oct 17, 2018 at 09:19:01AM +0200, Geert Uytterhoeven wrote:
> > > Providing an explicit list of discrimination factors may give the false
> > > impression that discrimination based on other unlisted factors would be
> > > allowed.
> >
> > This impression is, in fact, false, as has already been discussed
> > elsewhere. I had hoped that discussion would suffice.
> 
> The CoC FAQ is not part of the CoC, and not part of the Linux kernel.

I wasn't referring just to that; I'm referring to the discussion we've
already had on this exact point.

> > refers to. Listing explicit cases to cover does not imply other cases
> > are not covered;
> 
> It does, if not accompanied by "examples of...", like in the other sections.

"for everyone, regardless of ..." still says "for everyone", making the
"regardless of ..." inherently a non-exhaustive list of factors.

> > it does, however, ensure that the listed cases *are*,
> > and helps people know that they're covered.
> 
> So you agree people cannot know if the unlisted cases are covered or not?

People in underrepresented and commonly marginalized groups, especially
those more commonly overlooked, don't always know if a given group has
taken their particular group into account or given any thought to it.
Explicit inclusion helps, and this is a standard guideline often cited
for good codes of conduct.

That doesn't make other groups *not* covered. But *if* there is a
particular commonly marginalized group that you feel this should
*explicitly* cover and doesn't, I'd suggest *adding* that group rather
than deleting the existing effort to be explicitly inclusive. (And
again, I'd suggest doing so upstream first.)

> > This patch is not OK, and defeats one of the purposes of the original
> > change.
> 
> So the purpose of the original change was to list a number of factors,
> without saying that it was just a list of examples?

You seem to be actively trying to read something more into what I said.
One of the key purposes of the original change was to make the kernel a
"a welcoming environment to participate in", and to provide "explicit
guidelines".

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-17 Thread Josh Triplett

On Wed, Oct 17, 2018 at 09:19:01AM +0200, Geert Uytterhoeven wrote:
> Providing an explicit list of discrimination factors may give the false
> impression that discrimination based on other unlisted factors would be
> allowed.

This impression is, in fact, false, as has already been discussed
elsewhere. I had hoped that discussion would suffice.

As mentioned there: The original commit explicitly said "Explicit
guidelines have demonstrated success in other projects and other areas
of the kernel."; this is precisely the kind of explicit guideline it
refers to. Listing explicit cases to cover does not imply other cases
are not covered; it does, however, ensure that the listed cases *are*,
and helps people know that they're covered.

This patch is not OK, and defeats one of the purposes of the original
change.

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-17 Thread Josh Triplett

On Wed, Oct 17, 2018 at 09:19:01AM +0200, Geert Uytterhoeven wrote:
> Providing an explicit list of discrimination factors may give the false
> impression that discrimination based on other unlisted factors would be
> allowed.

This impression is, in fact, false, as has already been discussed
elsewhere. I had hoped that discussion would suffice.

As mentioned there: The original commit explicitly said "Explicit
guidelines have demonstrated success in other projects and other areas
of the kernel."; this is precisely the kind of explicit guideline it
refers to. Listing explicit cases to cover does not imply other cases
are not covered; it does, however, ensure that the listed cases *are*,
and helps people know that they're covered.

This patch is not OK, and defeats one of the purposes of the original
change.

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-10 Thread Josh Triplett

On Wed, Oct 10, 2018 at 01:55:04PM -0700, Frank Rowand wrote:
> On 10/07/18 01:51, Geert Uytterhoeven wrote:
> > Providing an explicit list of discrimination factors may give the false
> > impression that discrimination based on other unlisted factors would be
> > allowed.
> > 
> > Avoid any ambiguity by removing the list, to ensure "a harassment-free
> > experience for everyone", period.
[...]
> > diff --git a/Documentation/process/code-of-conduct.rst 
> > b/Documentation/process/code-of-conduct.rst
> > index ab7c24b5478c6b30..e472c9f86ff00b34 100644
> > --- a/Documentation/process/code-of-conduct.rst
> > +++ b/Documentation/process/code-of-conduct.rst
> > @@ -6,10 +6,7 @@ Our Pledge
> >  
> >  In the interest of fostering an open and welcoming environment, we as
> >  contributors and maintainers pledge to making participation in our project 
> > and
> > -our community a harassment-free experience for everyone, regardless of 
> > age, body
> > -size, disability, ethnicity, sex characteristics, gender identity and
> > -expression, level of experience, education, socio-economic status, 
> > nationality,
> > -personal appearance, race, religion, or sexual identity and orientation.
> > +our community a harassment-free experience for everyone.
> >  
> >  Our Standards
> >  =
> > 
> 
> 
> The words removed by this patch are a political statement.

Choosing not to say those words is a political statement.

See the original commit message for the code of conduct: "Explicit
guidelines have demonstrated success in other projects and other areas
of the kernel."

And see the FAQ entry at https://www.contributor-covenant.org/faq for
"The Contributor Covenant explicitly lists a set of protected classes;
does this make it acceptable to discriminate or make others feel
unwelcome based on other factors?" (I wrote that FAQ entry and submitted
it upstream, where it was enthusiastically merged.)

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-10 Thread Josh Triplett

On Wed, Oct 10, 2018 at 01:55:04PM -0700, Frank Rowand wrote:
> On 10/07/18 01:51, Geert Uytterhoeven wrote:
> > Providing an explicit list of discrimination factors may give the false
> > impression that discrimination based on other unlisted factors would be
> > allowed.
> > 
> > Avoid any ambiguity by removing the list, to ensure "a harassment-free
> > experience for everyone", period.
[...]
> > diff --git a/Documentation/process/code-of-conduct.rst 
> > b/Documentation/process/code-of-conduct.rst
> > index ab7c24b5478c6b30..e472c9f86ff00b34 100644
> > --- a/Documentation/process/code-of-conduct.rst
> > +++ b/Documentation/process/code-of-conduct.rst
> > @@ -6,10 +6,7 @@ Our Pledge
> >  
> >  In the interest of fostering an open and welcoming environment, we as
> >  contributors and maintainers pledge to making participation in our project 
> > and
> > -our community a harassment-free experience for everyone, regardless of 
> > age, body
> > -size, disability, ethnicity, sex characteristics, gender identity and
> > -expression, level of experience, education, socio-economic status, 
> > nationality,
> > -personal appearance, race, religion, or sexual identity and orientation.
> > +our community a harassment-free experience for everyone.
> >  
> >  Our Standards
> >  =
> > 
> 
> 
> The words removed by this patch are a political statement.

Choosing not to say those words is a political statement.

See the original commit message for the code of conduct: "Explicit
guidelines have demonstrated success in other projects and other areas
of the kernel."

And see the FAQ entry at https://www.contributor-covenant.org/faq for
"The Contributor Covenant explicitly lists a set of protected classes;
does this make it acceptable to discriminate or make others feel
unwelcome based on other factors?" (I wrote that FAQ entry and submitted
it upstream, where it was enthusiastically merged.)

Re: [Ksummit-discuss] [PATCH 1/2] code-of-conduct: Fix the ambiguity about collecting email addresses

2018-10-09 Thread Josh Triplett

On Tue, Oct 09, 2018 at 08:29:24PM +0200, Rainer Fiebig wrote:
> Am Montag, 8. Oktober 2018, 08:20:44 schrieb Josh Triplett:
> > On Sat, Oct 06, 2018 at 02:36:39PM -0700, James Bottomley wrote:
> > > The current code of conduct has an ambiguity in the it considers 
> > > publishing
> > > private information such as email addresses unacceptable behaviour.  Since
> > > the Linux kernel collects and publishes email addresses as part of the 
> > > patch
> > > process, add an exception clause for email addresses ordinarily collected 
> > > by
> > > the project to correct this ambiguity.
> > 
> > Upstream has now adopted a FAQ, which addresses this and many other
> > questions. See https://www.contributor-covenant.org/faq .
> > 
> > Might I suggest adding that link to the bottom of the document, instead?
> > (And then, optionally, submitting entries for that FAQ.)
> > 
> 
> The Code of Conflict has 28 lines, including the heading. 
> The Code of Conduct has 81 lines, including the heading. And it needs a FAQ. 
> Hm.

Yes, it turns out to be a more complicated problem than it was
previously oversimplified to. People don't automatically share a common
understanding.

Re: [Ksummit-discuss] [PATCH 1/2] code-of-conduct: Fix the ambiguity about collecting email addresses

2018-10-09 Thread Josh Triplett

On Tue, Oct 09, 2018 at 08:29:24PM +0200, Rainer Fiebig wrote:
> Am Montag, 8. Oktober 2018, 08:20:44 schrieb Josh Triplett:
> > On Sat, Oct 06, 2018 at 02:36:39PM -0700, James Bottomley wrote:
> > > The current code of conduct has an ambiguity in the it considers 
> > > publishing
> > > private information such as email addresses unacceptable behaviour.  Since
> > > the Linux kernel collects and publishes email addresses as part of the 
> > > patch
> > > process, add an exception clause for email addresses ordinarily collected 
> > > by
> > > the project to correct this ambiguity.
> > 
> > Upstream has now adopted a FAQ, which addresses this and many other
> > questions. See https://www.contributor-covenant.org/faq .
> > 
> > Might I suggest adding that link to the bottom of the document, instead?
> > (And then, optionally, submitting entries for that FAQ.)
> > 
> 
> The Code of Conflict has 28 lines, including the heading. 
> The Code of Conduct has 81 lines, including the heading. And it needs a FAQ. 
> Hm.

Yes, it turns out to be a more complicated problem than it was
previously oversimplified to. People don't automatically share a common
understanding.

Re: [Ksummit-discuss] [PATCH 1/2] code-of-conduct: Fix the ambiguity about collecting email addresses

2018-10-08 Thread Josh Triplett

On Mon, Oct 08, 2018 at 04:23:57PM -0300, Mauro Carvalho Chehab wrote:
> Em Mon, 08 Oct 2018 08:30:20 -0700
> James Bottomley  escreveu:
> 
> > On Mon, 2018-10-08 at 08:20 -0700, Josh Triplett wrote:
> > > On Sat, Oct 06, 2018 at 02:36:39PM -0700, James Bottomley wrote:  
> > > > The current code of conduct has an ambiguity in the it considers
> > > > publishing private information such as email addresses unacceptable
> > > > behaviour.  Since the Linux kernel collects and publishes email
> > > > addresses as part of the patch process, add an exception clause for
> > > > email addresses ordinarily collected by the project to correct this
> > > > ambiguity.  
> > > 
> > > Upstream has now adopted a FAQ, which addresses this and many other
> > > questions. See https://www.contributor-covenant.org/faq .
> > > 
> > > Might I suggest adding that link to the bottom of the document,
> > > instead? (And then, optionally, submitting entries for that FAQ.)  
> > 
> > We can debate that as part of everything else, but my personal opinion
> > would be we should never point to an outside document under someone
> > else's control for guidance as to how our community would enforce its
> > own code of conduct.
> 
> Fully agreed on that. The same argument that we use for GPL 2 only
> applies here: we should stick with an specific version of this it, in
> a way that we won't be automatically bound to whatever new version
> of it would say.

Linking to a FAQ with useful clarifications in it doesn't make those
"binding". This is *not* a legal agreement.

Re: [Ksummit-discuss] [PATCH 1/2] code-of-conduct: Fix the ambiguity about collecting email addresses

2018-10-08 Thread Josh Triplett

On Mon, Oct 08, 2018 at 04:23:57PM -0300, Mauro Carvalho Chehab wrote:
> Em Mon, 08 Oct 2018 08:30:20 -0700
> James Bottomley  escreveu:
> 
> > On Mon, 2018-10-08 at 08:20 -0700, Josh Triplett wrote:
> > > On Sat, Oct 06, 2018 at 02:36:39PM -0700, James Bottomley wrote:  
> > > > The current code of conduct has an ambiguity in the it considers
> > > > publishing private information such as email addresses unacceptable
> > > > behaviour.  Since the Linux kernel collects and publishes email
> > > > addresses as part of the patch process, add an exception clause for
> > > > email addresses ordinarily collected by the project to correct this
> > > > ambiguity.  
> > > 
> > > Upstream has now adopted a FAQ, which addresses this and many other
> > > questions. See https://www.contributor-covenant.org/faq .
> > > 
> > > Might I suggest adding that link to the bottom of the document,
> > > instead? (And then, optionally, submitting entries for that FAQ.)  
> > 
> > We can debate that as part of everything else, but my personal opinion
> > would be we should never point to an outside document under someone
> > else's control for guidance as to how our community would enforce its
> > own code of conduct.
> 
> Fully agreed on that. The same argument that we use for GPL 2 only
> applies here: we should stick with an specific version of this it, in
> a way that we won't be automatically bound to whatever new version
> of it would say.

Linking to a FAQ with useful clarifications in it doesn't make those
"binding". This is *not* a legal agreement.

Re: [Ksummit-discuss] [PATCH 2/2] code-of-conduct: Strip the enforcement paragraph pending community discussion

2018-10-08 Thread Josh Triplett

On Mon, Oct 08, 2018 at 02:15:25PM -0400, Chris Mason wrote:
> On 6 Oct 2018, at 17:37, James Bottomley wrote:
> > Significant concern has been expressed about the responsibilities
> > outlined in
> > the enforcement clause of the new code of conduct.  Since there is
> > concern
> > that this becomes binding on the release of the 4.19 kernel, strip the
> > enforcement clauses to give the community time to consider and debate
> > how this
> > should be handled.
> 
> Even in the places where I don't agree with the discussion about what our
> code of conduct should be, I love that we're having it.  Removing the
> enforcement clause basically goes back to the way things were.  We'd be
> recognizing that we know issues happen, and explicitly stating that when
> serious events do happen, the community as a whole isn't committing to
> helping.
> 
> It's true there are a lot of questions about how the community resolves
> problems and holds each other accountable for maintaining any code of
> conduct.  I think the enforcement section leaves us the room we need to
> continue discussions and still make it clear that we're making an effort to
> shift away from the harsh discussions in the past.

Emphatically seconded.

I absolutely agree that we should to work on the enforcement section
over time; for instance, I agree that a dedicated team (ideally with
some training) would be better than vesting this in a technical
decision-making body.

But I agree with Chris that we should not remove this entirely. And I
don't think there's any special significance to this being in the 4.19
release as compared to an -rc or git HEAD.

Re: [Ksummit-discuss] [PATCH 2/2] code-of-conduct: Strip the enforcement paragraph pending community discussion

2018-10-08 Thread Josh Triplett

On Mon, Oct 08, 2018 at 02:15:25PM -0400, Chris Mason wrote:
> On 6 Oct 2018, at 17:37, James Bottomley wrote:
> > Significant concern has been expressed about the responsibilities
> > outlined in
> > the enforcement clause of the new code of conduct.  Since there is
> > concern
> > that this becomes binding on the release of the 4.19 kernel, strip the
> > enforcement clauses to give the community time to consider and debate
> > how this
> > should be handled.
> 
> Even in the places where I don't agree with the discussion about what our
> code of conduct should be, I love that we're having it.  Removing the
> enforcement clause basically goes back to the way things were.  We'd be
> recognizing that we know issues happen, and explicitly stating that when
> serious events do happen, the community as a whole isn't committing to
> helping.
> 
> It's true there are a lot of questions about how the community resolves
> problems and holds each other accountable for maintaining any code of
> conduct.  I think the enforcement section leaves us the room we need to
> continue discussions and still make it clear that we're making an effort to
> shift away from the harsh discussions in the past.

Emphatically seconded.

I absolutely agree that we should to work on the enforcement section
over time; for instance, I agree that a dedicated team (ideally with
some training) would be better than vesting this in a technical
decision-making body.

But I agree with Chris that we should not remove this entirely. And I
don't think there's any special significance to this being in the 4.19
release as compared to an -rc or git HEAD.

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-08 Thread Josh Triplett

On Mon, Oct 08, 2018 at 04:42:47PM +0100, Alan Cox wrote:
> > In any case, this is not the appropriate place for such patches, any
> > more than it's the place for patches to the GPL.
> 
> I disagree. We had the GPLv2 or GPLv3 discussion on the kernel mailing
> list. The syscall clarification was discussed on the list. The
> EXPORT_SYMBOL and EXPORT_SYMBOL_GPL stuff was discussed on the list.

And discussion of this could occur on the list, as well. I said "for
such patches", not "for such discussion".

And, incidentally, this particular issues has now been addressed
upstream. I proposed a change to the FAQ, which has now been merged. See
https://www.contributor-covenant.org/faq and
https://github.com/ContributorCovenant/contributor_covenant/pull/612 .

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-08 Thread Josh Triplett

On Mon, Oct 08, 2018 at 04:42:47PM +0100, Alan Cox wrote:
> > In any case, this is not the appropriate place for such patches, any
> > more than it's the place for patches to the GPL.
> 
> I disagree. We had the GPLv2 or GPLv3 discussion on the kernel mailing
> list. The syscall clarification was discussed on the list. The
> EXPORT_SYMBOL and EXPORT_SYMBOL_GPL stuff was discussed on the list.

And discussion of this could occur on the list, as well. I said "for
such patches", not "for such discussion".

And, incidentally, this particular issues has now been addressed
upstream. I proposed a change to the FAQ, which has now been merged. See
https://www.contributor-covenant.org/faq and
https://github.com/ContributorCovenant/contributor_covenant/pull/612 .

Re: [Ksummit-discuss] [PATCH 1/2] code-of-conduct: Fix the ambiguity about collecting email addresses

2018-10-08 Thread Josh Triplett

On Sat, Oct 06, 2018 at 02:36:39PM -0700, James Bottomley wrote:
> The current code of conduct has an ambiguity in the it considers publishing
> private information such as email addresses unacceptable behaviour.  Since
> the Linux kernel collects and publishes email addresses as part of the patch
> process, add an exception clause for email addresses ordinarily collected by
> the project to correct this ambiguity.

Upstream has now adopted a FAQ, which addresses this and many other
questions. See https://www.contributor-covenant.org/faq .

Might I suggest adding that link to the bottom of the document, instead?
(And then, optionally, submitting entries for that FAQ.)

Re: [Ksummit-discuss] [PATCH 1/2] code-of-conduct: Fix the ambiguity about collecting email addresses

2018-10-08 Thread Josh Triplett

On Sat, Oct 06, 2018 at 02:36:39PM -0700, James Bottomley wrote:
> The current code of conduct has an ambiguity in the it considers publishing
> private information such as email addresses unacceptable behaviour.  Since
> the Linux kernel collects and publishes email addresses as part of the patch
> process, add an exception clause for email addresses ordinarily collected by
> the project to correct this ambiguity.

Upstream has now adopted a FAQ, which addresses this and many other
questions. See https://www.contributor-covenant.org/faq .

Might I suggest adding that link to the bottom of the document, instead?
(And then, optionally, submitting entries for that FAQ.)

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-07 Thread Josh Triplett

On Sun, Oct 07, 2018 at 08:18:26PM +0300, Laurent Pinchart wrote:
> Hi Josh,
> 
> On Sunday, 7 October 2018 14:35:14 EEST Josh Triplett wrote:
> > On Sun, Oct 07, 2018 at 10:51:02AM +0200, Geert Uytterhoeven wrote:
> > > Providing an explicit list of discrimination factors may give the false
> > > impression that discrimination based on other unlisted factors would be
> > > allowed.
> > > 
> > > Avoid any ambiguity by removing the list, to ensure "a harassment-free
> > > experience for everyone", period.
> > 
> > I would suggest reading the commit message that added this in the first
> > place. "Explicit guidelines have demonstrated success in other projects
> > and other areas of the kernel." See also various comparisons of codes of
> > conduct, which make the same point. The point of this list is precisely
> > to serve as one such explicit guideline; removing it would rather defeat
> > the purpose.
> > 
> > In any case, this is not the appropriate place for such patches, any
> > more than it's the place for patches to the GPL.
> 
> So what's an appropriate place to discuss the changes that we would like, 
> *together*, to make to the current document and propose upstream ?

I didn't say "not the appropriate place to discuss" (ksummit-discuss is
not ideal but we don't currently have somewhere better), I said "not the
appropriate place for such patches".

The Linux kernel is by no means the only project using the Contributor
Covenant. In general, we don't encourage people working on significant
changes to the Linux kernel to work in private for an extended period
and only pop up when "done"; rather, we encourage people to start
conversations early and include others in the design. Along the same
lines, I'd suggest that patches or ideas for patches belong upstream.
For instance, the idea of clarifying that email addresses already used
on a public mailing list don't count as "private information" seems like
a perfectly reasonable suggestion, and one that other projects would
benefit from as well.

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-07 Thread Josh Triplett

On Sun, Oct 07, 2018 at 08:18:26PM +0300, Laurent Pinchart wrote:
> Hi Josh,
> 
> On Sunday, 7 October 2018 14:35:14 EEST Josh Triplett wrote:
> > On Sun, Oct 07, 2018 at 10:51:02AM +0200, Geert Uytterhoeven wrote:
> > > Providing an explicit list of discrimination factors may give the false
> > > impression that discrimination based on other unlisted factors would be
> > > allowed.
> > > 
> > > Avoid any ambiguity by removing the list, to ensure "a harassment-free
> > > experience for everyone", period.
> > 
> > I would suggest reading the commit message that added this in the first
> > place. "Explicit guidelines have demonstrated success in other projects
> > and other areas of the kernel." See also various comparisons of codes of
> > conduct, which make the same point. The point of this list is precisely
> > to serve as one such explicit guideline; removing it would rather defeat
> > the purpose.
> > 
> > In any case, this is not the appropriate place for such patches, any
> > more than it's the place for patches to the GPL.
> 
> So what's an appropriate place to discuss the changes that we would like, 
> *together*, to make to the current document and propose upstream ?

I didn't say "not the appropriate place to discuss" (ksummit-discuss is
not ideal but we don't currently have somewhere better), I said "not the
appropriate place for such patches".

The Linux kernel is by no means the only project using the Contributor
Covenant. In general, we don't encourage people working on significant
changes to the Linux kernel to work in private for an extended period
and only pop up when "done"; rather, we encourage people to start
conversations early and include others in the design. Along the same
lines, I'd suggest that patches or ideas for patches belong upstream.
For instance, the idea of clarifying that email addresses already used
on a public mailing list don't count as "private information" seems like
a perfectly reasonable suggestion, and one that other projects would
benefit from as well.

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-07 Thread Josh Triplett

On Sun, Oct 07, 2018 at 10:51:02AM +0200, Geert Uytterhoeven wrote:
> Providing an explicit list of discrimination factors may give the false
> impression that discrimination based on other unlisted factors would be
> allowed.
> 
> Avoid any ambiguity by removing the list, to ensure "a harassment-free
> experience for everyone", period.

I would suggest reading the commit message that added this in the first
place. "Explicit guidelines have demonstrated success in other projects
and other areas of the kernel." See also various comparisons of codes of
conduct, which make the same point. The point of this list is precisely
to serve as one such explicit guideline; removing it would rather defeat
the purpose.

In any case, this is not the appropriate place for such patches, any
more than it's the place for patches to the GPL.

Re: [Ksummit-discuss] [PATCH] code-of-conduct: Remove explicit list of discrimination factors

2018-10-07 Thread Josh Triplett

On Sun, Oct 07, 2018 at 10:51:02AM +0200, Geert Uytterhoeven wrote:
> Providing an explicit list of discrimination factors may give the false
> impression that discrimination based on other unlisted factors would be
> allowed.
> 
> Avoid any ambiguity by removing the list, to ensure "a harassment-free
> experience for everyone", period.

I would suggest reading the commit message that added this in the first
place. "Explicit guidelines have demonstrated success in other projects
and other areas of the kernel." See also various comparisons of codes of
conduct, which make the same point. The point of this list is precisely
to serve as one such explicit guideline; removing it would rather defeat
the purpose.

In any case, this is not the appropriate place for such patches, any
more than it's the place for patches to the GPL.

Re: [kconfig-sat] [ANN] init-kconfig - easy way to embrace Linux's kconfig

2018-10-04 Thread Josh Triplett

On Thu, Oct 04, 2018 at 01:39:50PM -0700, Luis Chamberlain wrote:
> On Thu, Oct 04, 2018 at 01:09:00PM -0700, Josh Triplett wrote:
> > On Thu, Oct 04, 2018 at 01:02:49PM -0700, Luis Chamberlain wrote:
> > > Every now and then a project is born, and they decide to use Linux's
> > > kconfig to enable configuration of their project. As it stands we *know*
> > > kconfig is now used in at least over 12 different projects [0]. I myself
> > > added kconfig to one as well years ago. Even research reveals that
> > > kconfig has become one of the leading industrial variability modeling
> > > languages [1] [2].
> > > 
> > > What is often difficult to do though is to start off using kconfig and
> > > integrating it into a project. Or updating / syncing to the latest
> > > kconfig from upstream Linux.
> > > 
> > > I had yet another need to use kconfig for another small project so
> > > decided to make a clean template others can use and help keep it in sync.
> > > This is a passive fork which aims to keep in sync with the Linux
> > > kernel's latest kconfig to make it easier to keep up to date and to
> > > enable new projects to use and embrace kconfig on their own.  The goal
> > > is *not* to fork kconfig and evolve it separately, but rather keep in
> > > sync with the evolution of kconfig on Linux to make it easier for
> > > projects to use kconfig and also update their own kconfig when needed.
> > 
> > Is there a *fundamental* reason that we couldn't have this *be* Linux
> > kconfig, whether pulled in by submodule or regular merges, and avoid
> > having any divergence at all?
> 
> The structure of kconfig in Linux would have to be changed to make
> adoption and sync easier. If that is a goal we wish to embrace, I'm
> all for it.

I would *love* to see Kconfig in Linux evolved to be more easily reused.

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 2693 matches

Mail list logo