Re: Please review: lookup changes
On Mon, Mar 09, 2020 at 07:59:03PM +, Andrew Doran wrote: > > Why don't we check our watch immediately when we set IN_CHANGE , > > and defer the disk write to update the inode like UFS_WAPBL_UPDATE > > does -- and just skip UFS_ITIMES in VOP_GETATTR? > > This looks to go back least as far as Unix 32V (& possibly further) in one > form or another, from what I can see. Whatever the original intent I think > it might have morphed into an attempt to avoid querying the system clock at > some point, which is about the only point I can see for it now. Considering this originates in a different era.. I suppose it might also yield a better chance of observing distinct timestamps, if the system had poor clock resolution. Andrew
Re: Please review: lookup changes
On Sun, Mar 08, 2020 at 09:22:28PM +, Taylor R Campbell wrote: > > Date: Thu, 5 Mar 2020 22:48:25 + > > From: Andrew Doran > > > > I'd like to merge the changes on the ad-namecache branch, and would > > appreciate any code review. The motivation for these changes is, as > > you might guess, performance. I put an overview below. I won't be > > merging this for a couple of weeks in any case. > > Thanks for working on this. Lookups are a big point of contention, so > it's worth putting effort into fixing that. At the same time, I want > to make sure we don't regress into the vfs locking insanity we had ten > years ago; dholland@ and especially hannken@ have put a lot of work > into cleaning it up and making it understandable and maintainable. > Please make sure to wait until hannken@ has had an opportunity weigh > in on it. Absolutely. Some philosophical ramblings: I see the namecache as a thing that allows you to avoid re-doing expensive stuff that you've already done once before - whether that's disc I/O and metadata crawls (old application) or multiprocessor locking (the app here). Were it 1993 I might well be up for ripping out vnode locking and going back to inode locking, but we're way too far down the chosen path now, and there has been major effort invested by many people over the years to work the vnode locking into shape, as you note. It works very well now, and I don't think we should mess with it too much. With that in mind I have tried do this in a way that requires as few changes to the vnode locking and file systems as possible: use the namecache to do things fast whenever possible, but preserve the strongly locked path. > > If the file system exports IMNT_NCLOOKUP, this now tries to "fast > > forward" through as many component names as possible by looking > > directly in the namecache, without taking vnode locks nor vnode > > refs, and stops when the leaf is reached or there is something that > > can't be handled using the namecache. In the most optimistic case a > > vnode reference is taken only at the root and the leaf. > > IMNT_NCLOOKUP is implemented right now only for tmpfs, ffs. > > What file systems can't support IMNT_NCLOOKUP and why? Good question. Other than layered file systems I don't know for sure, but I think it can be done for most FSes. NFS might have its own complications; I haven't looked yet. IMNT_NCLOOKUP is more of a feature gate than anything else. The file system hooks are unintrusive and few in number but adding those hooks takes careful, time consuming code inspection to make sure nothing is missed. > > Layered file systems can't work that way (names are cached below), > > It seems there's a tradeoff here: > > 1. We could cache layered vnodes, at the cost of doubling the size of >the namecache. > > 2. We could _not_ cache layered vnodes, at the cost of having to redo >layer_node_create and vcache_get for every lookup -- a process >which scales worse than namecache lookups (global vcache_lock). > > Am I missing anything else about this? Is this tradeoff the only > reason that it's still up to the vnode operations to call cache_lookup > and cache_enter, rather than something namei does directly -- so that > layer_lookup can skip them? As far as I recall layer vnodes persist unless recycled like any other vnode, so there should be nothing special happening there around contention due to lifecycle. I think you could cache or transpose the cached names above, maybe using some kind of VOP. I definitely think it's doable but beyond that, at least for me thare lots of unknowns so I have avoided it. > > and the permissions check won't work for anything unusual like ACLs, > > so there is also IMNT_SHRLOOKUP. If the file system exports > > IMNT_SHRLOOKUP, then VOP_LOOKUP() for "read only" lookups (e.g. not > > RENAME & friends) is tried first with an LK_SHARED lock on the > > directory. For an FS that can do the whole lookup this way (e.g. > > tmpfs) all is good. Otherwise the FS is free to return ENOLCK from > > VOP_LOOKUP() if something tricky comes up, and the lookup gets > > retried with LK_EXCLUSIVE (e.g. ffs when it needs to scan directory > > metadata). > > Except for access to the create/rename/remove hints (which we should > do differently anyway [*]), in what circumstances does ufs lookup need > an exclusive lock? Why does scanning directory metadata require that? > > [*] Specifically: We should push the last component lookup into > directory operations, and leave it to *fs_rename/remove/ to invoke > an internal lookup routine that does the right thing with locks and > returns the hint to the caller rather than stashing it in the > directory's struct inode. This has been intended for several years, > but dholland and I have had a shortage of round tuits." I was thinking of the sequential scan optimisation from 4.3BSD ("2passes" in the
Re: NULL pointer arithmetic issues
On Mon, Mar 9, 2020 at 2:53 PM Martin Husemann wrote: > > On Mon, Mar 09, 2020 at 12:41:37PM -0400, Aaron Ballman wrote: > > > You could view NULL as a special pointer pointing to an inaccessible > > > zero sized object. Adding 0 to it still points to the same special > > > non-object. I guess that is how the C++ folks see it. > > > > This wording has always been UB in C. It stopped being UB in C++20 but > > was UB prior to that. I did not see a core issue pertaining to this > > (though it's possible I've missed it -- the C++ core issues list is a > > bit hard to track these days), so I am not certain why this change was > > made so recently. > > AFAICT it has always been well defined in C++, though the wording changed > a bit. In C++11 (n3337) it is 5.7 note 7: "If the value of 0 is added or > subtracted from a pointer value, the result compares equal to the original > pointer value." Same text but moved one note down to 8 in n3797 (C++14). > N4659 (C++17) makes it explicit for NULL pointers in 8.7 note 7 (as > Taylor cited earlier in this thread): "If the value 0 is added or subtracted > from a null pointer value, the result is a null pointer value." Yes, that's my mistake, sorry about that misinformation! (Taylor also brought it up off-list.) ~~Aaron > > Martin
Re: NULL pointer arithmetic issues
On Mon, Mar 09, 2020 at 12:41:37PM -0400, Aaron Ballman wrote: > > You could view NULL as a special pointer pointing to an inaccessible > > zero sized object. Adding 0 to it still points to the same special > > non-object. I guess that is how the C++ folks see it. > > This wording has always been UB in C. It stopped being UB in C++20 but > was UB prior to that. I did not see a core issue pertaining to this > (though it's possible I've missed it -- the C++ core issues list is a > bit hard to track these days), so I am not certain why this change was > made so recently. AFAICT it has always been well defined in C++, though the wording changed a bit. In C++11 (n3337) it is 5.7 note 7: "If the value of 0 is added or subtracted from a pointer value, the result compares equal to the original pointer value." Same text but moved one note down to 8 in n3797 (C++14). N4659 (C++17) makes it explicit for NULL pointers in 8.7 note 7 (as Taylor cited earlier in this thread): "If the value 0 is added or subtracted from a null pointer value, the result is a null pointer value." Martin
Re: NULL pointer arithmetic issues
On Mon, Mar 9, 2020 at 12:36 PM Joerg Sonnenberger wrote: > > On Mon, Mar 09, 2020 at 09:50:50AM -0400, Aaron Ballman wrote: > > On Sun, Mar 8, 2020 at 2:30 PM Taylor R Campbell > > > I ask because in principle a conformant implementation could compile > > > the NetBSD kernel into a useless blob that does nothing -- we rely on > > > all sorts of behaviour relative to a real physical machine that is not > > > defined by the letter of the standard, like inline asm, or converting > > > integers from the VM system's virtual address allocator into pointers > > > to objects. But such an implementation would not be useful. > > > > Whether an optimizer elects to use this form of UB to make > > optimization decisions is a matter of QoI. My personal feeling is that > > I don't trust this sort of optimization -- it takes code the > > programmer wrote and makes it behave in a fundamentally different > > manner. I'm in support of UBSan diagnosing these constructs because it > > is UB and an optimizer is definitely allowed to optimize based on it > > but I wouldn't be in support of an optimizer that aggressively > > optimizes on this. > > I consider it as something even worse. Just like the case of passing > NULL pointers to memcpy and friends with zero as size, this > interpretation / restriction in the standard is actively harmful to some > code for the sake of potential optimisation opportunities in other code. > It seems to be a poor choice at that. I.e. it requires adding > conditional branches for something that behaves sanely everywhere but > may the DS9k. +1 -- I'm personally not a fan of allowing these kinds of optimizations. I refer to the entire class of similar operations as "exploiting undefined behavior" for this reason. ~~Aaron > > Joerg
Re: NULL pointer arithmetic issues
On Mon, Mar 9, 2020 at 11:51 AM Martin Husemann wrote: > > On Mon, Mar 09, 2020 at 09:38:31AM -0400, Aaron Ballman wrote: > > The way I read this is: > > > > "If the pointer operand points to an element of an array object" -- it > > does not (null is not a valid array object). > > "Moreover, if the expression P points to the last element of an array > > object" -- it does not (null is not a valid array object). > > "If both the pointer operand and the result point to elements of the > > same array object, or one past the last element of the array > > object..." -- it does not (there is no valid array so they cannot > > point to elements of the same array). > > All fine. > > > "...otherwise, the behavior is undefined." -- this is where we hit the UB. > > But it started with "If the pointer operand points to an element of an > array object" and my reading is that the whole section does not apply. I don't think that's the correct predicate to read. The full sentence I reach is: "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined." The pointer operand and the result do not point to elements of the same array object or one past the last element, so I hit the "otherwise" case. But even under your interpretation, I see the result as being UB -- anything the standard doesn't explicitly specify is also undefined behavior, so if the whole section doesn't apply, nothing else specifies the behavior. > NULL is special, and we did not change it's value, so it stays a NULL > pointer. The section you cited just does not tell anything about arithmetic > on NULL pointers - and the wording makes it sound like this is more an > oversight than a concious descision. That was why I asked on the reflectors; there was sufficient confusion from people reading the standard to warrant asking "are we reading this properly" and the answer was: yes, we are interpreting it properly and that NULL + 0 is UB. > Eespecially adding or subtracting zero. > > You could view NULL as a special pointer pointing to an inaccessible > zero sized object. Adding 0 to it still points to the same special > non-object. I guess that is how the C++ folks see it. This wording has always been UB in C. It stopped being UB in C++20 but was UB prior to that. I did not see a core issue pertaining to this (though it's possible I've missed it -- the C++ core issues list is a bit hard to track these days), so I am not certain why this change was made so recently. ~~Aaron > > > Martin
Re: NULL pointer arithmetic issues
On Mon, Mar 09, 2020 at 09:50:50AM -0400, Aaron Ballman wrote: > On Sun, Mar 8, 2020 at 2:30 PM Taylor R Campbell > > I ask because in principle a conformant implementation could compile > > the NetBSD kernel into a useless blob that does nothing -- we rely on > > all sorts of behaviour relative to a real physical machine that is not > > defined by the letter of the standard, like inline asm, or converting > > integers from the VM system's virtual address allocator into pointers > > to objects. But such an implementation would not be useful. > > Whether an optimizer elects to use this form of UB to make > optimization decisions is a matter of QoI. My personal feeling is that > I don't trust this sort of optimization -- it takes code the > programmer wrote and makes it behave in a fundamentally different > manner. I'm in support of UBSan diagnosing these constructs because it > is UB and an optimizer is definitely allowed to optimize based on it > but I wouldn't be in support of an optimizer that aggressively > optimizes on this. I consider it as something even worse. Just like the case of passing NULL pointers to memcpy and friends with zero as size, this interpretation / restriction in the standard is actively harmful to some code for the sake of potential optimisation opportunities in other code. It seems to be a poor choice at that. I.e. it requires adding conditional branches for something that behaves sanely everywhere but may the DS9k. Joerg
Re: NULL pointer arithmetic issues
On Mon, Mar 09, 2020 at 09:38:31AM -0400, Aaron Ballman wrote: > The way I read this is: > > "If the pointer operand points to an element of an array object" -- it > does not (null is not a valid array object). > "Moreover, if the expression P points to the last element of an array > object" -- it does not (null is not a valid array object). > "If both the pointer operand and the result point to elements of the > same array object, or one past the last element of the array > object..." -- it does not (there is no valid array so they cannot > point to elements of the same array). All fine. > "...otherwise, the behavior is undefined." -- this is where we hit the UB. But it started with "If the pointer operand points to an element of an array object" and my reading is that the whole section does not apply. NULL is special, and we did not change it's value, so it stays a NULL pointer. The section you cited just does not tell anything about arithmetic on NULL pointers - and the wording makes it sound like this is more an oversight than a concious descision. Eespecially adding or subtracting zero. You could view NULL as a special pointer pointing to an inaccessible zero sized object. Adding 0 to it still points to the same special non-object. I guess that is how the C++ folks see it. Martin
Re: modload(8) v.s. alias
Our in-kernel linker does not understand weak references. It would be a lot of work to implement, and there are some issues that would need to be considered carefully. For example, if module A defines a weak alias, and module B is loaded and resolves the alias, what do you do if module C gets loaded and we discover a "strong" definition of the same symbol? Do you keep the original resolution, or do you re-resolve it to the new definition? On Mon, 9 Mar 2020, Rin Okuyama wrote: Hi, modload(8) fails if some depended functions are alias, at least on m68k. For example: # uname -a NetBSD 9.99.49 NetBSD 9.99.49 (GENERIC) #6: Mon Mar 9 22:53:07 JST 2020 rin@latipes:/build/src/sys/arch/sun2/compile/GENERIC sun2 # modload nfs [xxx.xxx] kobj_checksyms, 998: [nfs]: linker error: symbol '__ffssi2' not found [xxx.xxx] WARNING: module error: unable to affix module 'nfs', error 8 modload: nfs: Exec format error # Here, __ffssi4 is weak alias of ffs on m68k: # nm /netbsd | grep __ffssi2 0012327c W __ffssi2 # nm /netbsd | grep 0012327c 0012327c W __ffssi2 0012327c T ffs # If this symbol is turned into a ``real'' object by attached patch, modload(8) and loaded module work fine: # nm /netbsd | grep __ffssi4 0012327c T __ffssi2 # modload nfs # mount_nfs server:/path /mnt # ls /mnt foo bar baz ... # Is this a bug or feature? How can I fix this problem? Thanks, rin Index: common/lib/libc/arch/m68k/string/ffs.S === RCS file: /cvsroot/src/common/lib/libc/arch/m68k/string/ffs.S,v retrieving revision 1.7 diff -p -u -r1.7 ffs.S --- common/lib/libc/arch/m68k/string/ffs.S 9 Mar 2020 13:36:10 - 1.7 +++ common/lib/libc/arch/m68k/string/ffs.S 9 Mar 2020 14:09:13 - @@ -45,7 +45,11 @@ /* bit = ffs(value) */ +#ifdef _LIBC WEAK_ALIAS(__ffssi2,ffs) +#else /* KERNEL */ +#define ffs __ffssi2 +#endif #if (!defined(__mc68010__) && !defined(__mcoldfire__)) || defined(__mcfisac__) !DSPAM:5e664fb8246343444541299! ++--+---+ | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | (Retired) | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com | | Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org | ++--+---+
Re: NULL pointer arithmetic issues
On Sun, Mar 8, 2020 at 2:30 PM Taylor R Campbell wrote: > > > Date: Sun, 8 Mar 2020 20:52:29 +0300 > > From: Roman Lebedev > > > > so we are allowed to lower that in clang front-end as: > > > > int > > statu(char *a) > > { > > __builtin_assume(a != NULL); > > a += getuid() - geteuid(); > > __builtin_assume(a != NULL); > > Allowed, yes. > > What I'm wondering is whether this is something Clang will actually do > -- and whether it is for any serious reason other than to capriciously > screw people who write software for real machines instead of for the > pure C abstract machine to the letter of the standard (and if so, > whether -fno-delete-null-pointer-checks is enough to disable it). > > Evidently making that assumption _not_ allowed in C++, so presumably > it's not important for performance purposes. It's also not important > for expressive purposes, because I could just as well have written > assert(a != NULL). > > > > I was told by Roman that it was checked during a C committee meeting and > > > confirmed to be an intentional UB. > > Correction: Aaron Ballman asked about this in non-public WG14 reflector > > mailing list, it wasn't a committee meeting, but the point still stands. > > What does `intentional' UB mean, exactly? What is the intent behind > having p + 0 for null p be undefined in C, while the C++ committee saw > fit to define it? Intentional as in: the question was considered and it was decided to make that scenario explicitly be UB (as opposed to it being UB because we didn't say anything about it in the standard or never considered the question in the first place). > Is it because there technically once existed C implementations on > bizarre architectures with wacky arithmetic on pointers like Zeta-C, > or is it because there are actual useful consequences to inferring > that p must be nonnull if we evaluate p + 0? As for intent, I can only speculate (I wasn't there for the original specification work on this. I might not have even been alive for the original specification work on this.): performance gains. For instance, a segmented memory architecture may have multiple value representations of a null pointer. Even a flat memory architecture can see potential optimization gains from making an assumption that arithmetic on a pointer means the pointer is valid. > I ask because in principle a conformant implementation could compile > the NetBSD kernel into a useless blob that does nothing -- we rely on > all sorts of behaviour relative to a real physical machine that is not > defined by the letter of the standard, like inline asm, or converting > integers from the VM system's virtual address allocator into pointers > to objects. But such an implementation would not be useful. Whether an optimizer elects to use this form of UB to make optimization decisions is a matter of QoI. My personal feeling is that I don't trust this sort of optimization -- it takes code the programmer wrote and makes it behave in a fundamentally different manner. I'm in support of UBSan diagnosing these constructs because it is UB and an optimizer is definitely allowed to optimize based on it but I wouldn't be in support of an optimizer that aggressively optimizes on this. ~~Aaron
modload(8) v.s. alias
Hi, modload(8) fails if some depended functions are alias, at least on m68k. For example: # uname -a NetBSD 9.99.49 NetBSD 9.99.49 (GENERIC) #6: Mon Mar 9 22:53:07 JST 2020 rin@latipes:/build/src/sys/arch/sun2/compile/GENERIC sun2 # modload nfs [xxx.xxx] kobj_checksyms, 998: [nfs]: linker error: symbol '__ffssi2' not found [xxx.xxx] WARNING: module error: unable to affix module 'nfs', error 8 modload: nfs: Exec format error # Here, __ffssi4 is weak alias of ffs on m68k: # nm /netbsd | grep __ffssi2 0012327c W __ffssi2 # nm /netbsd | grep 0012327c 0012327c W __ffssi2 0012327c T ffs # If this symbol is turned into a ``real'' object by attached patch, modload(8) and loaded module work fine: # nm /netbsd | grep __ffssi4 0012327c T __ffssi2 # modload nfs # mount_nfs server:/path /mnt # ls /mnt foo bar baz ... # Is this a bug or feature? How can I fix this problem? Thanks, rin Index: common/lib/libc/arch/m68k/string/ffs.S === RCS file: /cvsroot/src/common/lib/libc/arch/m68k/string/ffs.S,v retrieving revision 1.7 diff -p -u -r1.7 ffs.S --- common/lib/libc/arch/m68k/string/ffs.S 9 Mar 2020 13:36:10 - 1.7 +++ common/lib/libc/arch/m68k/string/ffs.S 9 Mar 2020 14:09:13 - @@ -45,7 +45,11 @@ /* bit = ffs(value) */ +#ifdef _LIBC WEAK_ALIAS(__ffssi2,ffs) +#else /* KERNEL */ +#define ffs __ffssi2 +#endif #if (!defined(__mc68010__) && !defined(__mcoldfire__)) || defined(__mcfisac__)
Re: NULL pointer arithmetic issues
On Mon, Mar 9, 2020 at 9:21 AM Martin Husemann wrote: > > On Mon, Mar 09, 2020 at 01:34:23PM +0100, Kamil Rytarowski wrote: > > > We instruct a C compiler that pointer used in the pserialize macros is > > never NULL, as the side effect of adding to it 0. > > I question that side effect. > > The C++ standard disagrees with your interpration, I have not seen clear > statements in any official C standard, and we have been discussing here > tons of totally unrelated things. Untill proven otherwise I consider that > ubsan behaviour a bug. C17 6.5.6p8 says: When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i + n-th and i - n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated. The way I read this is: "If the pointer operand points to an element of an array object" -- it does not (null is not a valid array object). "Moreover, if the expression P points to the last element of an array object" -- it does not (null is not a valid array object). "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object..." -- it does not (there is no valid array so they cannot point to elements of the same array). "...otherwise, the behavior is undefined." -- this is where we hit the UB. When discussed on the committee reflector there were no objections to that interpretation, so this is my understanding of the C committee's current position. If you still think this is unclear, you can file a clarification request with the committee to get an official response considered by the full committee. ~~Aaron
Re: NULL pointer arithmetic issues
On Mon, Mar 09, 2020 at 01:34:23PM +0100, Kamil Rytarowski wrote: > We instruct a C compiler that pointer used in the pserialize macros is > never NULL, as the side effect of adding to it 0. I question that side effect. The C++ standard disagrees with your interpration, I have not seen clear statements in any official C standard, and we have been discussing here tons of totally unrelated things. Untill proven otherwise I consider that ubsan behaviour a bug. Martin
Re: NULL pointer arithmetic issues
On 09.03.2020 07:05, Martin Husemann wrote: > Also note that the getuid()/geteuid() example here is IMHO unrelated to the > original issue that caused this discussion, so I am not even convinced this > is NOT a ubsan bug. We instruct a C compiler that pointer used in the pserialize macros is never NULL, as the side effect of adding to it 0. As the pointer can be NULL, this at least confuses the compiler and can result in a miscompilation. We workaround it today with -fno-delete-null-pointer-checks in RUMP. In regular userland we shall avoid NULL pointer arithmetic. signature.asc Description: OpenPGP digital signature
Re: NULL pointer arithmetic issues
On Sun, Mar 08, 2020 at 09:58:05PM +0100, Kamil Rytarowski wrote: > Aaron (as he was mentioned by name), is a voting member in the C++ > committee and actively working on harmonizing C and C++ standards. There > is a good chance that C and C++ will be synced here (they definitely > should). Yes, they should. In this regard one of them is sane and one is totaly absurd and actively harmfull to existing working code. Unbreak C, please. Also note that the getuid()/geteuid() example here is IMHO unrelated to the original issue that caused this discussion, so I am not even convinced this is NOT a ubsan bug. Martin