Re: CVS commit: src/lib/libc
> Date: Sat, 8 Jun 2024 11:51:43 +0200 > From: Roland Illig > > Am 07.06.2024 um 22:50 schrieb Taylor R Campbell: > > libc: Pacify lint on aarch64. > > > > +++ src/lib/libc/stdlib/Makefile.incFri Jun 7 20:50:13 2024 > > +# lint(1) spuriously complains about `*s == CHAR_MAX' even though *s > > +# has type char. > > +LINTFLAGS.strfmon.c += -X 230 > > I guess the "spuriously" here means "on platforms where 'char' is > unsigned", not "sometimes on the same platform, unpredictably". Correct. > Lint does not warn if the constant is defined as a character constant, > so '\xff' instead of 0xff would work, but I don't know what else would > be affected if the definition of CHAR_MAX were changed. Might work but I'm reluctant to try without extensive testing on a lot of compilers in a lot of environments. E.g., does that work in C89? > Ideally, lint would not warn about this expression, but since lint only > looks at the preprocessed translation unit, it cannot know that the 0xff > comes from CHAR_MAX and thus is fine. Practically, suppressing the > warning in this particular case makes sense. > > Any ideas how to resolve this situation? No brilliant ideas, sorry, other than to teach lint to track provenance of constants through macro expansion. That's why I just disabled the warning for this case.
Re: CVS commit: src/lib/libc
Am 07.06.2024 um 22:50 schrieb Taylor R Campbell: > libc: Pacify lint on aarch64. > > +++ src/lib/libc/stdlib/Makefile.inc Fri Jun 7 20:50:13 2024 > +# lint(1) spuriously complains about `*s == CHAR_MAX' even though *s > +# has type char. > +LINTFLAGS.strfmon.c += -X 230 I guess the "spuriously" here means "on platforms where 'char' is unsigned", not "sometimes on the same platform, unpredictably". When CHAR_MAX is defined as 0xff, lint assumes that the same constant with the same value is used on other platforms as well, thus the warning. This is due to lint's -p flag for portability checks. Lint does not warn if the constant is defined as a character constant, so '\xff' instead of 0xff would work, but I don't know what else would be affected if the definition of CHAR_MAX were changed. Ideally, lint would not warn about this expression, but since lint only looks at the preprocessed translation unit, it cannot know that the 0xff comes from CHAR_MAX and thus is fine. Practically, suppressing the warning in this particular case makes sense. Any ideas how to resolve this situation? Roland
Re: CVS commit: src/lib/libc/time
Valery Ushakov wrote in : |On Fri, Dec 08, 2023 at 01:32:49 +0300, Valery Ushakov wrote: | |> On Thu, Dec 07, 2023 at 20:13:37 +, Robert Elz wrote: |> |>> While here, consistemntly use minus when minus is meant, rather that |>> just using a hyphen. |> |> One has to be careful with this. | |And to have this on record for refernce: https://lwn.net/Articles/947941/ 'Could be you like that: https://lists.gnu.org/archive/html/groff/2022-09/msg00048.html (or these https://lists.gnu.org/archive/html/groff/2022-09/msg00053.html https://lists.gnu.org/archive/html/groff/2022-09/msg00057.html heck i could vomit all thread long. :-) |-uwe --End of Ciao. --steffen | |Der Kragenbaer,The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt) | | Only in December: lightful Dubai COP28 Narendra Modi quote: | A small part of humanity has ruthlessly exploited nature. | But the entire humanity is bearing the cost of it, | especially the inhabitants of the Global South. | The selfishness of a few will lead the world into darkness, | not just for themselves but for the entire world.
Re: CVS commit: src/lib/libc/time
On Fri, Dec 08, 2023 at 01:32:49 +0300, Valery Ushakov wrote: > On Thu, Dec 07, 2023 at 20:13:37 +, Robert Elz wrote: > > > While here, consistemntly use minus when minus is meant, rather that > > just using a hyphen. > > One has to be careful with this. And to have this on record for refernce: https://lwn.net/Articles/947941/ -uwe
Re: CVS commit: src/lib/libc/time
On Thu, Dec 07, 2023 at 20:13:37 +, Robert Elz wrote: > While here, consistemntly use minus when minus is meant, rather that > just using a hyphen. One has to be careful with this. In the literal context (Ql, Li, etc) the ascii minus-hyphen in the input is preserved as such. In other contexts you will get a math minus sign in PS/PDF output, or groff UTF-8 text output, which makes man page not copy-pastable. And wide (about the size of en-dash) minus sign looks rather ugly in non math contexts like 02:00 CST (−06) in PS/PDF. -uwe
Re: CVS commit: src/lib/libc/ssp
On Wednesday, November 15, 2023 4:15:28 AM CET you wrote: > Module Name: src > Committed By: christos > Date: Wed Nov 15 03:15:28 UTC 2023 > > Modified Files: > src/lib/libc/ssp: Makefile.inc > Added Files: > src/lib/libc/ssp: ssp_redirect.c > > Log Message: > provide materialized functions for the ssp overriden inlines The functions are supposed to be transparent and they used to be. Can we please just go back to the working state before? IMO wanting to overriding getcwd is absolutely no justification for this. If the prototype (and inline function) is visible from the header, userland should *not* be abled to interpose it. If it is not visible due to standard headers, there was no problem in first place. Joerg
re: CVS commit: src/lib/libc/stdlib
> Minor changes to jemalloc100 (the old one that only vax etc currently uses). thanks. i'm still using this version on a bunch of modern machines. new jemalloc was problematic for a few things for me a number of years ago and i keep meaning to test again, but for now i'm still mostly using this version everwhere. FYI. .mrg.
Re: CVS commit: src/lib/libc/time
Date:Wed, 26 Oct 2022 10:42:15 -0400 From:Jan Schaumann Message-ID: | New proposal: That looks better (it needs some minor wording changes, there are too many indefinite articles ('a') which aren't needed, and there midnight should be just that, no hyphen or space -- but that's all trivia). kre
Re: CVS commit: src/lib/libc/time
Robert Elz wrote: > Date:Sun, 23 Oct 2022 13:53:01 -0400 > From:Jan Schaumann > Message-ID: > > | Hmm, maybe something like this? > > I think there is still too much there, you don't have > to explain everything (or almost anything), but it is > in the right direction I think. New proposal: --- [...] components are not restricted to their normal ranges and will be normalized, if need be. For example, consider a struct tm initialized with a tm_year = 122, tm_mon = 10, tm_mday = 30, tm_hour = 22, tm_min = 57, and a tm_sec = 0. Incrementing tm_min by 13 and calling mktime() would lead to a tm_hour = 23 and tm_min = 10. This normalizing can lead to cascading changes: Again using a struct tm initialized as in the above example but with a tm_hour = 23, the same change would lead to a tm_mon = 11, tm_mday = 1, tm_hour = 0, and tm_min = 10. Negative values may also be normalized with similar cascading effect such that e.g., a tm_hour of −1 means 1 hour before mid‐ night on the previous day and so on. --- ? -Jan
Re: CVS commit: src/lib/libc/time
Taylor R Campbell wrote in <20221023170035.2542f60...@jupiter.mumble.net>: ... |If you use a monotonic timer to sample the POSIX clock before and |after a leap second, the POSIX clock will appear to have taken twice |as long as it should to pass the leap second. Just to note that the next leap second could be a negative one. --steffen | |Der Kragenbaer,The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt)
Re: CVS commit: src/lib/libc/time
Date:Sun, 23 Oct 2022 13:53:01 -0400 From:Jan Schaumann Message-ID: | Hmm, maybe something like this? I think there is still too much there, you don't have to explain everything (or almost anything), but it is in the right direction I think. | For example, consider a struct tm initialized with a | tm_year = 122, tm_mon = 11, tm_mday = 30, tm_hour = | 22, tm_min = 57, and a tm_sec = 0, using UTC, | representing 2022â12â31T22:57:00Z. That last bit (the ISO format spec of the time) isn't really needed, but doesn't hurt either - but I would avoid an example that touches anywhere near midnight Dec 31, UTC, (which this one does later, if not here) as that's exactly where leap seconds start to appear, and (as illustrated in the side-discussion in this thread) that's where things get messy. Best to just avoid that (Avoid June 30 for the same reason.) | Incrementing | tm_min by 13 and calling mktime() would yield a time_t | with value 1672528200, That's irrelevant, the time_t returned isn't what's really interesting here, and its binary value is way too much (useless) information, it is the modification made to the tm that matters. It is already clear enough in the doc (I think) that the result from mktime() is the time_t for the normalised tm. | representing 2022â12â31T23:10:00Z, this, or just give the adjusted tm_min and tm_hour values, there's no need for both, and | as the tm_min = 70 was | normalized to an increment of tm_hour by one and | tm_min = 10. no need for the explanation of how it was done. | This normalizing can lead to cascading changes: Again | using a struct tm initialized as in the above example | but with a tm_hour = 23, the same change would yield a | time_t with value 1672531800, representing | 2023â01â01T00:10:00Z That's the adjustment we want to avoid, as it gets right into what happens if we're observing leap seconds, and one was to happen in that period.There's no need to show the year being incremented, showing the month going up would be enough, readers ought to be able to deduce that if the month changes from Dec to Jan then the year would be incremented by one. | the normalization of tm_min incremented tm_hour, This explanation is not needed, but if it were, that would be correct, but | which lead to a normalization of tm_mday, which in | turn caused tm_mon to be normalized, but not those. Those fields (in this example) were already within the appropriate range, they don't need to be normalised, they're simply adjusted, or as in this (or tm_hour above): | which in turn lead to the increment of tm_year. | In addition, That's perhaps poor wording here, "addition" followed immediately by "negative", it could be "Also" which avoids this, but this lead in clause is not really needed at all, just begin like: | negative values may also be normalized, | such that e.g., a tm_hour of â1 means 1 hour before | midnight, tm_mday of 0 means the day preceding the | current month, and tm_mon of â2 means 2 months before | January of tm_year. Again, too much there, we don't need examples of everything. I still feel though that an example with more than one adjustment to a tm returned by localtime (though how it was originally created is irrelevant) and which would affect Feb 29 were it a leap year, would be worth giving (in both the leap year and non leap year cases) so we get Feb 29 in one case and Mar 1 in the other, from an adjustment that affects months (and days, either directly, or as a flow on from hours, mins, or secs) - like going back a month from Mar 28, then forward to the next day, and sometimes still being in Mar. | The fact that mktime(2) returns a time_t makes the | phrasing even more awkward, Yes, as above. | and I suppose we could | leave out the actual value and say "would yield a | time_t representing..."? No, nothing about the result returned at all, this should all just be about what normalising the tm causes to happen. Those values are not just internal, the tm passed (via ref) is modified if needed before mktime returns. kre
Re: CVS commit: src/lib/libc/time
On Sun, Oct 23, 2022 at 11:00 AM Taylor R Campbell < campbell+netbsd-source-change...@mumble.net> wrote: > > Date: Sun, 23 Oct 2022 07:39:25 -0600 > > From: Warner Losh > > > > I guess a more accurate way of saying this is that leap seconds simply > > aren't reliable, cannot be made reliable, and this affects normalization > > in ways too numerous to mention due to the details of the tz files, bugs > > in the system, and lack of others to implement them correctly. > > I think you mean `POSIX clocks simply aren't reliable'. They _could_ > be made reliable, though, by fixing the the calendar arithmetic > formula in POSIX for mapping time_t to and from UTC -- just like POSIX > already fixed the bug in its Gregorian leap year formula. > Except they can't, at least not practically enough to be a standard. The Gregorian Leap Year formula is a mathematical formula that needs no further data other than the broken down time to compute. It's not an observational calendar, but a computational or arithmetic one. UTC is an observational calendar. We barely know if there's going to be a leap second in the coming months, and have nothing more than a vague notion of when the one after that might be. You must have a table of all past leap seconds to do any kind of sensible mapping. And you also must have some way to keep that up to date, even when machines are powered off, or installed from not really that old media (anything older than 6 months can't possibly have the right leap table, except by chance). And then the question becomes how do you get it, do you assume connectivity, some standard media format, some standard file format, etc. All of these details means POSIX can't really fix this. And even if they do, the current formula has been around so long there's a lot of dusty decks of code that will likely silently break. You can ameliorate that somewhat by inventing new interfaces, but issues like the one you go into below will still persist. > > > The code works with either set of tzdata files, POSIX stretchy secs, > > > or UTC with leap secs - claiming that one doesn't happen, or cannot > > > happen, isn't really correct. > > > > Yea, and even 'posix stretchy sec' is really a misnomer. POSIX simply > > counts time without leap seconds. Each second is the same length, > > Not at all. > > If you use a monotonic timer to sample the POSIX clock before and > after a leap second, the POSIX clock will appear to have taken twice > as long as it should to pass the leap second. > > Of course, it's worse. If sampled at _subsecond_ intervals, a POSIX > clock behaves _erratically_: it spontaneously rewinds itself! > > Suppose we have a machine with a monotonic clock that counts SI > seconds as well as a POSIX clock: > > SI monotonicPOSIX > 123.25 1483228799.00 > 123.50 1483228799.25 > 123.75 1483228799.50 # t0 = boottime + 123.75 > 124.00 1483228799.75 > 124.25 1483228800.00 # leap second begins at 2016-12-31T23:59:60Z > 124.50 1483228800.25 > 124.75 1483228800.50 > 125.00 1483228800.75 # t1 = boottime + 125.00 > 125.25 1483228800.00 # POSIX clock rewinds at > 2017-01-01T00:00:00Z! > 125.50 1483228800.25 > 125.75 1483228800.50 # t2 = boottime + 125.75 > 126.00 1483228800.75 > > At supersecond resolution, t2 - t0 is a duration of 2 SI seconds, but > a POSIX clock reports a time difference POSIX(t2) - POSIX(t0) of 1, so > `POSIX seconds' are not always SI seconds -- it is not the case that > `each [POSIX] second is the same length', even ignoring physical clock > sampling error. > Right. Except during that brief interval around a leap second, all the seconds are the same size. They aren't expanded or contracted by running the clock at a different frequency that was done between approx 1960-1972 which is often referred to as the rubber leap second era. It was more on that basis that I was objecting to the turn of phrase because that sort of thing isn't happening. > And, of course, at subsecond resolution, the POSIX clock rewinds. The > monotonic clock correctly has t1 < t2, but POSIX(t1) > POSIX(t2). And > this erratic behaviour is much worse than a typical NTP-driven clock > adjustment at random times, because by design this erratic behaviour > happens on ~every computer on the planet simultaneously! > Yea, if NTP knows about the leap, it can deal with it. The problem as you say comes in when the stratum 1 servers don't announce the leap second far enough in advance for the implementations to cope. It then devolves to the 'when, exactly, do you step the time back' problem since there's a couple of choices, unless you have the 'leap smear' ntp servers which do it over a few hours. > There's no need for this nonsense except insistence on the formula > that says every UTC day is counted by 86400 `POSIX seconds'. POSIX > could be revised to fix this bug in the clock by just not doing civil > calendar
Re: CVS commit: src/lib/libc/time
> Date: Sun, 23 Oct 2022 12:19:37 -0600 > From: Warner Losh > > You must have a table of all past > leap seconds to do any kind of sensible mapping. And you also must > have some way to keep that up to date, [...] It's the same for all civil calendar matters with political changes around summer time, but at a smaller scale: with an outdated tzdb, where localtime might be off for an hour, mktime might be off by a few seconds. > > And, of course, at subsecond resolution, the POSIX clock rewinds. The > > monotonic clock correctly has t1 < t2, but POSIX(t1) > POSIX(t2). And > > this erratic behaviour is much worse than a typical NTP-driven clock > > adjustment at random times, because by design this erratic behaviour > > happens on ~every computer on the planet simultaneously! > > Yea, if NTP knows about the leap, it can deal with it. The problem as you > say comes in when the stratum 1 servers don't announce the leap second > far enough in advance for the implementations to cope. It then devolves to > the 'when, exactly, do you step the time back' problem since there's a > couple > of choices, unless you have the 'leap smear' ntp servers which do it over > a few hours. I mentioned NTP-driven clock adjustments as an example of _other_ times that a POSIX clock might rewind. Such adjustments arise from local clock drift and so occur randomly, in contrast to the rewinding on leap seconds that is globally coordinated as a deliberate design decision of POSIX leading to simultaneous worldwide computer system failures. NTP doesn't have to know about leap seconds at all, nor does POSIX. They could both just report the number of SI seconds (averaged in the usual TAI way) since an appropriate epoch, and all the leap second adjustment logic could be ripped out and moved to tzdb and mktime just like time zones and summer time. The fundamental problem with a POSIX clock -- and to a lesser extent NTP, although NTP at least transmits additional information to work around it during a leap second transition unlike POSIX programs -- is that it goes out of its way to step back the _internal concept of timecounting_ (tick-tick-tick, counting what everyone expects to be SI seconds) just to deal with an _external civil calendar presentation_, like the summer time change. And it makes sure to do so simultaneously on all computers worldwide. That's a spectacularly bad design leading to tremendously costly mistakes. Leap seconds aren't a big deal; the erratic design of POSIX clocks is.
Re: CVS commit: src/lib/libc/time
Robert Elz wrote: > Date:Sat, 22 Oct 2022 13:42:54 -0400 > From:Jan Schaumann > Message-ID: > > | A full set of examples strikes me as too much here > > A full set would be yes, but I think we could handle 3. Hmm, maybe something like this? ...and will be normalized, if need be. For example, consider a struct tm initialized with a tm_year = 122, tm_mon = 11, tm_mday = 30, tm_hour = 22, tm_min = 57, and a tm_sec = 0, using UTC, representing 2022‐12‐31T22:57:00Z. Incrementing tm_min by 13 and calling mktime() would yield a time_t with value 1672528200, representing 2022‐12‐31T23:10:00Z, as the tm_min = 70 was normalized to an increment of tm_hour by one and tm_min = 10. This normalizing can lead to cascading changes: Again using a struct tm initialized as in the above example but with a tm_hour = 23, the same change would yield a time_t with value 1672531800, representing 2023‐01‐01T00:10:00Z: the normalization of tm_min incremented tm_hour, which lead to a normalization of tm_mday, which in turn caused tm_mon to be normalized, which in turn lead to the increment of tm_year. In addition, negative values may also be normalized, such that e.g., a tm_hour of −1 means 1 hour before midnight, tm_mday of 0 means the day preceding the current month, and tm_mon of −2 means 2 months before January of tm_year. --- The fact that mktime(2) returns a time_t makes the phrasing even more awkward, and I suppose we could leave out the actual value and say "would yield a time_t representing..."? Other ideas to improve this? -Jan
Re: CVS commit: src/lib/libc/time
> Date: Sun, 23 Oct 2022 07:39:25 -0600 > From: Warner Losh > > I guess a more accurate way of saying this is that leap seconds simply > aren't reliable, cannot be made reliable, and this affects normalization > in ways too numerous to mention due to the details of the tz files, bugs > in the system, and lack of others to implement them correctly. I think you mean `POSIX clocks simply aren't reliable'. They _could_ be made reliable, though, by fixing the the calendar arithmetic formula in POSIX for mapping time_t to and from UTC -- just like POSIX already fixed the bug in its Gregorian leap year formula. > > The code works with either set of tzdata files, POSIX stretchy secs, > > or UTC with leap secs - claiming that one doesn't happen, or cannot > > happen, isn't really correct. > > Yea, and even 'posix stretchy sec' is really a misnomer. POSIX simply > counts time without leap seconds. Each second is the same length, Not at all. If you use a monotonic timer to sample the POSIX clock before and after a leap second, the POSIX clock will appear to have taken twice as long as it should to pass the leap second. Of course, it's worse. If sampled at _subsecond_ intervals, a POSIX clock behaves _erratically_: it spontaneously rewinds itself! Suppose we have a machine with a monotonic clock that counts SI seconds as well as a POSIX clock: SI monotonicPOSIX 123.25 1483228799.00 123.50 1483228799.25 123.75 1483228799.50 # t0 = boottime + 123.75 124.00 1483228799.75 124.25 1483228800.00 # leap second begins at 2016-12-31T23:59:60Z 124.50 1483228800.25 124.75 1483228800.50 125.00 1483228800.75 # t1 = boottime + 125.00 125.25 1483228800.00 # POSIX clock rewinds at 2017-01-01T00:00:00Z! 125.50 1483228800.25 125.75 1483228800.50 # t2 = boottime + 125.75 126.00 1483228800.75 At supersecond resolution, t2 - t0 is a duration of 2 SI seconds, but a POSIX clock reports a time difference POSIX(t2) - POSIX(t0) of 1, so `POSIX seconds' are not always SI seconds -- it is not the case that `each [POSIX] second is the same length', even ignoring physical clock sampling error. And, of course, at subsecond resolution, the POSIX clock rewinds. The monotonic clock correctly has t1 < t2, but POSIX(t1) > POSIX(t2). And this erratic behaviour is much worse than a typical NTP-driven clock adjustment at random times, because by design this erratic behaviour happens on ~every computer on the planet simultaneously! There's no need for this nonsense except insistence on the formula that says every UTC day is counted by 86400 `POSIX seconds'. POSIX could be revised to fix this bug in the clock by just not doing civil calendar adjustments in the basic clock that goes tick-tick-tick for counting what most people think are going to be SI seconds. For the time_t<->UTC conversion in libc, machines with out-of-date tzdb would just be off by a few seconds sometimes, no worse than being off by an hour in the time_t<->localtime conversion with an out-of-date tzdb across an updated summer time change.
Re: CVS commit: src/lib/libc/time
On Sat, Oct 22, 2022 at 10:55 PM Robert Elz wrote: > Date:Sat, 22 Oct 2022 20:17:57 -0600 > From:Warner Losh > Message-ID: < > canczdfqhkz0xs7lf6lmjveontc5dgsonons_ug6fzsf30og...@mail.gmail.com> > > > | On the other hand, adding a caveat that leap seconds don't exist in a > POSIX > | time_t and that's necessarily reflected in the normalization wouldn't > be > | bad. > > The problem with doing that is that while we don't install it this > way, it is possible to install tzdata such that leap seconds do occur > (despite that not being posix) either unconditionally, or installing > both and leaving it up to the user by their choice of timezone to use > (which makes rather a mess). > That's true. In the default "UTC system time" clock, leap seconds simply do no work at all. With the tzdata with leap seconds, they work for a limited subset of things, but not quite everything because you can't completely close the gap due to time_t not having an encoding for them. Having tried to deploy systems that needed to present a pedantically-correct UTC time to the outside world shows many hole (most of which self correct and are mostly right not near leap seconds). I guess a more accurate way of saying this is that leap seconds simply aren't reliable, cannot be made reliable, and this affects normalization in ways too numerous to mention due to the details of the tz files, bugs in the system, and lack of others to implement them correctly. > The code works with either set of tzdata files, POSIX stretchy secs, > or UTC with leap secs - claiming that one doesn't happen, or cannot > happen, isn't really correct. > Yea, and even 'posix stretchy sec' is really a misnomer. POSIX simply counts time without leap seconds. Each second is the same length, and it doesn't make them longer or shorter. It just assumes some ad-hoc, beyond the standard method for synchronization. The quality of these ad-hoc methods ranges from terrible to halfway-decent, but none are perfect. OK. I've got my annual leap seconds rant out of the way. Even 15 years after shipping these systems, the almost (but not completely) right nature of how computers implement them still has my knee jerking just a little too much... Warner
Re: CVS commit: src/lib/libc/time
On Sat, Oct 22, 2022, 8:05 PM Robert Elz wrote: > Date:Sat, 22 Oct 2022 13:42:54 -0400 > From:Jan Schaumann > Message-ID: > > | A full set of examples strikes me as too much here > > A full set would be yes, but I think we could handle 3. > They don't need to be C code examples, just something > like > if tm_year==2022 tm_mon==10 tm_mday==19 >tm_hour==23 tm_min==58 tm_sec==17 and tm_dst==-1 > > and tm_min was incremented as tm_min+=180, and then > mktime() called upon the result, the struct tm would > now have tm_min==1 tm_hour==24 and tm_mday==20, with > tm_isdst 0 or 1 depending upon the local time zone. > The other fields shown would not be altered in this > example, the fields of struct tm not show here would > (tm_wday, ...) would be filled in with appropriate values. > That sounds like a good idea... [Using tm_min rather than tm_sec to avoid needing to include > caveats about possible leap seconds, which, at least currently, > could never happen in November, but all of that is too much > to bother with.] > On the other hand, adding a caveat that leap seconds don't exist in a POSIX time_t and that's necessarily reflected in the normalization wouldn't be bad. It's a quite common misconception that somehow they do because they exist in UTC... while you can encode any valid UTC time, they are always normalized away. The other examples could be even shorter, no need to repeat > the final sentence about the other field changes or otherwise. > > | Perhaps: > | > | This normalization is done in order from tm_sec up to > | tm_year, > > No, that is what you cannot say, because it is wrong. > > We do not always start from tm_sec, and tm_mon is *always* > normalised (with its possible flow on to tm_year) before > tm_mday - nothing else makes sense. tm_year isn't > "normalized" at all, though it might be adjusted. > > But all of this is what we already agreed is too much > to explain. And no matter how simple it makes things, > the man pages cannot (deliberately) lie. > Agreed. These nuances are important. | possibly leading to cascading values. That > | is, a tm_sec value of e.g., 185 with a tm_min value of > | 58 could lead to an increment of tm_min by three, and > | thus further incrementing tm_hour by one, and so on. > > This is more explaining how it works, which if we do at all, > we need to do correctly. That's where a few examples can > be better, they allow the reader to deduce what probably > happens, the weird one is there to show that the results > will not always be what is expected at first glance, but > that they always are correct. > > | As with most things relating to > | time, dates, and calendars, the full details of these > | side effects are often non‐obvious, and it may be best > | to avoid such scenarios. > > And that part is just useless, unless you mean that non-normalised > values should never be used, perhaps just in some fields (and it > isn't that anyway, a big enough increment of tm_sec is the same > as an increment of tm_mday or tm_mon) in which case we should just > say that. If not that, then hinting that it is sometimes odd, > and those cases, which ones we refuse to tell you, should be > avoided is a complete waste of time. Which scenarios should be > avoided? The answer really is none of them, the answer is > always correct, it just might not be what the user expected. > All good points. Warner > kre > >
Re: CVS commit: src/lib/libc/time
Date:Sat, 22 Oct 2022 20:17:57 -0600 From:Warner Losh Message-ID: | On the other hand, adding a caveat that leap seconds don't exist in a POSIX | time_t and that's necessarily reflected in the normalization wouldn't be | bad. The problem with doing that is that while we don't install it this way, it is possible to install tzdata such that leap seconds do occur (despite that not being posix) either unconditionally, or installing both and leaving it up to the user by their choice of timezone to use (which makes rather a mess). The code works with either set of tzdata files, POSIX stretchy secs, or UTC with leap secs - claiming that one doesn't happen, or cannot happen, isn't really correct. kre
Re: CVS commit: src/lib/libc/time
Date:Sun, 23 Oct 2022 08:33:18 +0700 From:Robert Elz Message-ID: <7638.1666488...@jacaranda.noi.kre.to> | and tm_min was incremented as tm_min+=180, and then | mktime() called upon the result, the struct tm would | now have tm_min==1 tm_hour==24 and tm_mday==20, with That is nonsense of course, but I think you get the idea (and being nonsense shows why the examples should be tested ) kre
Re: CVS commit: src/lib/libc/time
Date:Sat, 22 Oct 2022 13:42:54 -0400 From:Jan Schaumann Message-ID: | A full set of examples strikes me as too much here A full set would be yes, but I think we could handle 3. They don't need to be C code examples, just something like if tm_year==2022 tm_mon==10 tm_mday==19 tm_hour==23 tm_min==58 tm_sec==17 and tm_dst==-1 and tm_min was incremented as tm_min+=180, and then mktime() called upon the result, the struct tm would now have tm_min==1 tm_hour==24 and tm_mday==20, with tm_isdst 0 or 1 depending upon the local time zone. The other fields shown would not be altered in this example, the fields of struct tm not show here would (tm_wday, ...) would be filled in with appropriate values. [Using tm_min rather than tm_sec to avoid needing to include caveats about possible leap seconds, which, at least currently, could never happen in November, but all of that is too much to bother with.] The other examples could be even shorter, no need to repeat the final sentence about the other field changes or otherwise. | Perhaps: | | This normalization is done in order from tm_sec up to | tm_year, No, that is what you cannot say, because it is wrong. We do not always start from tm_sec, and tm_mon is *always* normalised (with its possible flow on to tm_year) before tm_mday - nothing else makes sense. tm_year isn't "normalized" at all, though it might be adjusted. But all of this is what we already agreed is too much to explain. And no matter how simple it makes things, the man pages cannot (deliberately) lie. | possibly leading to cascading values. That | is, a tm_sec value of e.g., 185 with a tm_min value of | 58 could lead to an increment of tm_min by three, and | thus further incrementing tm_hour by one, and so on. This is more explaining how it works, which if we do at all, we need to do correctly. That's where a few examples can be better, they allow the reader to deduce what probably happens, the weird one is there to show that the results will not always be what is expected at first glance, but that they always are correct. | As with most things relating to | time, dates, and calendars, the full details of these | side effects are often nonâobvious, and it may be best | to avoid such scenarios. And that part is just useless, unless you mean that non-normalised values should never be used, perhaps just in some fields (and it isn't that anyway, a big enough increment of tm_sec is the same as an increment of tm_mday or tm_mon) in which case we should just say that. If not that, then hinting that it is sometimes odd, and those cases, which ones we refuse to tell you, should be avoided is a complete waste of time. Which scenarios should be avoided? The answer really is none of them, the answer is always correct, it just might not be what the user expected. kre
Re: CVS commit: src/lib/libc/time
Robert Elz wrote: > jo...@bec.de said: > | I'd actually weasle out a bit > > Yes, I would too, but A full set of examples strikes me as too much here still, and I do like the idea of weaseling out. Perhaps: This normalization is done in order from tm_sec up to tm_year, possibly leading to cascading values. That is, a tm_sec value of e.g., 185 with a tm_min value of 58 could lead to an increment of tm_min by three, and thus further incrementing tm_hour by one, and so on. Likewise, negative values can lead to decrementing other tm(3) fields. As with most things relating to time, dates, and calendars, the full details of these side effects are often non‐obvious, and it may be best to avoid such scenarios. ? -Jan
Re: CVS commit: src/lib/libc/time
Date:Fri, 21 Oct 2022 15:00:41 -0400 From:Jan Schaumann Message-ID: | I believe that it's useful for people to know that | values outside the normal ranges are normalized, Agreed. | as well as what that might look like. The problem with trying to specify that, as we have discovered, is that it is not easy. The problem with only doing half of the job here, is that sometime later someone will read it, see what it says happens, and what it doesn't say, and then expect the code to do that - when it probably doesn't. I'll come back to the rest of your message a bit lower, but first a brief interlude ... jo...@bec.de said: | I'd actually weasle out a bit Yes, I would too, but jo...@bec.de continued: | and just declare that normalisation is best effort and we | don't guarantee behavior for values that are very much outside | the range of the corresponding field. E.g. anything where | numerical overflow can happen in a component. that isn't really the problem. tzcode is very careful with arithmetic to avoid those kinds of problems - if there's anywhere that it isn't (which is still possible) that would be fixed if pointed out. The problem is just the complexity of specifying what does happen, in a way that fits the manual page in a way that is easier to follow that simply reading the source (which isn't all that easy). Now we return to the originally scheduelled e-mail: jscha...@netmeister.org said: | That helps a lot when trying to determine _why_ the | output of mktime(3) [...] | Noting that values may cascade is useful [...] I wonder if rather than attempting to specify what happens, it might be better in this case to give examples of using struct tm alterations, followed by mktime(), to alter the time, and what the results are. I'm not generally much in favour of examples in man pages, I prefer specification, and leave it up to the reader to determine how to use things to achieve the desired result, as while giving an example or two (or three) of what can be done is great for people who want to do one of those exact things (and so can just copy the example) but is useless for someone whose needs are different than any of the examples, and is left trying to guess what will happen. But here a two or three good examples of what happens might just be a sane solution, perhaps two fairly simple ones, but which demonstrate what happens with addition, and subtraction - using just one field modification, but having that result in a cascade through a few of the other fields - perhaps the 180 seconds forward, starting at 23:58 on the last day of some month, and 32 hours backwards starting at 01:15 on the second of a month.Those should both produce entirely reasonable and understandable results. The third should show multiple modifications, in a way that produces, or can do, a surprising result, perhaps 14 days forward, also 2 months forward, and some irrelevant small number of minutes backward (that just to show that multiple fields can be altered, and they don't need to all go in the same direction) starting at 13:27 on Dec 15 - where the result can be Feb 29, or Mar 1 (at something a bit earlier than 13:27, perhaps even make the number of minutes subtracted > 27 in that example) depending upon whether the year is a leap year or not. (Or make it 6 months forward (plus...) starting from August). This one should be one full example, with something like "if the starting year had been then the result would have been" added to show the difference between leap years and others. If this is done, the examples should all be tested to verify the claimed results, but the man page should not need a lot of explanation of why the result shown is achieved, just the input, the modification made, and the result. kre
Re: CVS commit: src/lib/libc/time
Am Fri, Oct 21, 2022 at 02:06:32PM +0700 schrieb Robert Elz: > Date:Fri, 21 Oct 2022 03:05:15 + > From:"Jan Schaumann" > Message-ID: <20221021030515.cdc9df...@cvs.netbsd.org> > > | Note normalizing behavior of mktime(3) using language from FreeBSD. > > If we are going to start specifying what happens (how the struct tm > is normalised) then we really also need to specify in which order > the fields are corrected. It makes a difference. I'd actually weasle out a bit and just declare that normalisation is best effort and we don't guarantee behavior for values that are very much outside the range of the corresponding field. E.g. anything where numerical overflow can happen in a component. Joerg
Re: CVS commit: src/lib/libc/time
Robert Elz wrote: > Date:Fri, 21 Oct 2022 11:54:08 -0400 > From:Jan Schaumann > Message-ID: > > | Right, but all that strikes me as too much to put into > | the manual page here. > > I agree, which is why I wondered if we should be giving any > details about how it is done at all. I believe that it's useful for people to know that values outside the normal ranges are normalized, as well as what that might look like. That helps a lot when trying to determine _why_ the output of mktime(3) with a tm_sec = 180 yields no error and a value of tm_min += 3. I don't think many people would want to go and dig up the sources to make sense of this. Noting that values may cascade is useful (because that, too, is not obvious), but seems to me the limit of detail desired here, while still being useful. -Jan
Re: CVS commit: src/lib/libc/time
Date:Fri, 21 Oct 2022 11:54:08 -0400 From:Jan Schaumann Message-ID: | Right, but all that strikes me as too much to put into | the manual page here. I agree, which is why I wondered if we should be giving any details about how it is done at all. | I think what matters in this context is that there is | an order and that there is a cascading effect. Perhaps, but | "This normalization is done in order from tm_sec to | tm_mon (inclusive), possibly leading to cascading | values." As a generic order, that's backwards. Really we need to go from year (which is always OK) to month (which can always be corrected simply, assuming the Gregorian calendar is in effect, which we assume it is, even for years before that calendar was created). Once we know the year and month, but not before, the day of month can be corrected (which might then affect the month and year, but only once). The code doesn't do it in quite that order, it does the secs mins hours first, but doing it that way ensures that the hours correction cannot affect the day, after it has been corrected, the day(of month) always gets set last (leap seconds excepted). It really is moderately complicated - trying to explain it at all risks being simply wrong, or so vague as to be completely useless.I'm not sure that what was there before (no details at all, just (in other words) "it gets fixed") wasn't as good as is reasonable to expect. kre
Re: CVS commit: src/lib/libc/time
Robert Elz wrote: > Date:Fri, 21 Oct 2022 10:36:23 -0400 > From:Jan Schaumann > Message-ID: > > > | Perhaps like this? > > Like that, yes, but as you wrote it isn't how it is actually > done I believe. > > The order looks to be secs, mins, hours, month, day(of month). > > There's no normalisation of the year (no invalid values to correct) > though the year can be adjusted when the month is adjusted. > > That order is assuming that the system is using posix stretchy seconds, > rather than UTC (with leap seconds), but almost all NetBSD systems > work that way - if UTC is being used, then secs cannot be done first, > as until the rest of the data/time is known, it is unknown whether > the number of seconds permitted is 59 60 or 61. With POSIX seconds > the secs/mins/hours correction order doesn't really matter, as there > are always 0..59 secs, 0..59 mins, and 0..23 hours, the rest of the > fields have no impact at all. Right, but all that strikes me as too much to put into the manual page here. I think what matters in this context is that there is an order and that there is a cascading effect. "This normalization is done in order from tm_sec to tm_mon (inclusive), possibly leading to cascading values." or "This normalization is done in order from tm_sec up to tm_year, possibly leading to cascading values." provides sufficient information, I think? -Jan
Re: CVS commit: src/lib/libc/time
Date:Fri, 21 Oct 2022 10:36:23 -0400 From:Jan Schaumann Message-ID: | Perhaps like this? Like that, yes, but as you wrote it isn't how it is actually done I believe. The order looks to be secs, mins, hours, month, day(of month). There's no normalisation of the year (no invalid values to correct) though the year can be adjusted when the month is adjusted. That order is assuming that the system is using posix stretchy seconds, rather than UTC (with leap seconds), but almost all NetBSD systems work that way - if UTC is being used, then secs cannot be done first, as until the rest of the data/time is known, it is unknown whether the number of seconds permitted is 59 60 or 61. With POSIX seconds the secs/mins/hours correction order doesn't really matter, as there are always 0..59 secs, 0..59 mins, and 0..23 hours, the rest of the fields have no impact at all. kre
Re: CVS commit: src/lib/libc/time
Robert Elz wrote: > Date:Fri, 21 Oct 2022 03:05:15 + > From:"Jan Schaumann" > Message-ID: <20221021030515.cdc9df...@cvs.netbsd.org> > > | Note normalizing behavior of mktime(3) using language from FreeBSD. > > If we are going to start specifying what happens (how the struct tm > is normalised) then we really also need to specify in which order > the fields are corrected. It makes a difference. Perhaps like this? Index: ctime.3 === RCS file: /cvsroot/src/lib/libc/time/ctime.3,v retrieving revision 1.65 diff -u -p -r1.65 ctime.3 --- ctime.3 21 Oct 2022 03:08:29 - 1.65 +++ ctime.3 21 Oct 2022 14:34:39 - @@ -264,6 +264,11 @@ of 0 means the day preceding the current .Fa tm_mon of \-2 means 2 months before January of .Fa tm_year . +This normalization is done in order from +.Fa tm_sec +to +.Fa tm_year , +possibly leading to cascading values. (A positive or zero value for .Fa tm_isdst causes
Re: CVS commit: src/lib/libc/time
Date:Fri, 21 Oct 2022 03:05:15 + From:"Jan Schaumann" Message-ID: <20221021030515.cdc9df...@cvs.netbsd.org> | Note normalizing behavior of mktime(3) using language from FreeBSD. If we are going to start specifying what happens (how the struct tm is normalised) then we really also need to specify in which order the fields are corrected. It makes a difference. kre
Re: CVS commit: src/lib/libc/stdlib
Date:Thu, 29 Sep 2022 08:18:49 -0400 From:Christos Zoulas Message-ID: | Yes, I had forgotten about the need to do this explicitly... | > On Sep 28, 2022, at 10:23 PM, Robert Elz wrote: | > | > Apologies, I did not read the code closely enough, there must be a bug | > in the way the clone file descriptor is created. I think I know the cause now, and it isn't anything specific to /dev/ptmx it would affect all cloning devices. In kern/vfs_syscalls.c do_open() we see the following: if (vp == NULL) { fd_abort(p, fp, indx); error = fd_dupopen(dupfd, dupfd_move, flags, ); if (error) return error; *fd = indx; } else { error = open_setfp(l, fp, vp, indx, flags); if (error) return error; VOP_UNLOCK(vp); *fd = indx; fd_affix(p, fp, indx); } The vp==NULL case is where cloning devices are handled. fd_dupopen() arranges top make the duplicate happen. It attempts to handle O_CLOEXEC thus: error = fd_dup(fp, 0, newp, ff->ff_exclose); where ff_exclose is the close_on_exec bit for the old file descriptor (the one being duplicated) which is just fine for other calls of fd_dupopen(). The vp!=NULL case above is for "normal" opens, and open_setfp() is what handles O_CLOEXEC - that's what (eventually) makes ff_exclose be true (it also handles O_EXLOCK and O_SHLOCK. None of that is being done in the cloning device open case, so those 3 flags simply cannot work, as the code is written currently, for these devices. I'm not about to go playing in this very fiddly piece of code, but I'm kind of hoping that dholland@ might know the right magic sequence to make this issue go away. It probably means a call to open_setfp() with some parameters or other, somewhere in the vp==NULL side of that if statement, though, just what I have not attempted to work out. kre
Re: CVS commit: src/lib/libc/stdlib
Yes, I had forgotten about the need to do this explicitly... christos > On Sep 28, 2022, at 10:23 PM, Robert Elz wrote: > > Apologies, I did not read the code closely enough, there must be a bug > in the way the clone file descriptor is created. > > kre signature.asc Description: Message signed with OpenPGP
Re: CVS commit: src/lib/libc/stdlib
Apologies, I did not read the code closely enough, there must be a bug in the way the clone file descriptor is created. kre
Re: CVS commit: src/lib/libc/stdlib
Date:Wed, 28 Sep 2022 20:48:53 -0400 From:"David H. Gutteridge" Message-ID: <9c9c8e9d9384338320b47dabfdc94...@gutteridge.ca> | (O_CLOEXEC, on the other hand, is ignored, at the moment.) No it isn't, your test is faulty, O_CLOEXEC is a different kind of flag, applies at a different level, and is fetched a different way. That's what dholland@ tried to tell you a few days ago. kre | | $ cat open_test.c | #include | #include | #include | | int main(int argc, char* argv[]) | { | int fd, flags; | | printf("Regular file (read-only) with O_NONBLOCK | O_CLOEXEC.\n"); | if ((fd = open("/etc/release", O_RDONLY | O_NONBLOCK | O_CLOEXEC)) < 0) | printf("Failed to get file handle.\n"); | else { | printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD)); | printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0)); | close(fd); | } | | printf("POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC.\n"); | if ((fd = open("/dev/ptmx", O_RDWR | O_NOCTTY | O_NONBLOCK | | O_CLOEXEC)) < 0) | printf("Failed to open /dev/ptmx.\n"); | else { | printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD)); | printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0)); | close(fd); | } | | return 0; | } | | $ ./open_test | Regular file (read-only) with O_NONBLOCK | O_CLOEXEC. | Descriptor flags: 1 | Status flags: 4 | POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC. | Descriptor flags: 0 | Status flags: 6 | | Regards, | | Dave |
Re: CVS commit: src/lib/libc/stdlib
On Wed, 28 Sep 2022, at 15:07:41 - (UTC), Christos Zoulas wrote: In article <20220928003547.D2375FA92%cvs.NetBSD.org@localhost>, David H. Gutteridge wrote: -=-=-=-=-=- Module Name:src Committed By: gutteridge Date: Wed Sep 28 00:35:47 UTC 2022 Modified Files: src/lib/libc/stdlib: posix_openpt.3 Log Message: posix_openpt.3: reflect flag changes from r. 1.44 of tty_ptm.c Some flags are now accepted, others are still ignored. (E.g., other BSDs would return EINVAL if O_RDWR wasn't passed, and we now accept O_NONBLOCK but not O_CLOEXEC.) How so? #define FCNTLFLAGS (FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FDSYNC|FRSYNC|FALTIO|\ ^ FDIRECT|FNOSIGPIPE) /* bits to save after open(2) */ #define FMASK (FREAD|FWRITE|FCNTLFLAGS|FEXEC) ^^ Best, christos Hi Christos, I'm sorry, I don't follow your question? I wrote "we now accept O_NONBLOCK...", which is what I'm reading you're saying too? (O_CLOEXEC, on the other hand, is ignored, at the moment.) $ cat open_test.c #include #include #include int main(int argc, char* argv[]) { int fd, flags; printf("Regular file (read-only) with O_NONBLOCK | O_CLOEXEC.\n"); if ((fd = open("/etc/release", O_RDONLY | O_NONBLOCK | O_CLOEXEC)) < 0) printf("Failed to get file handle.\n"); else { printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD)); printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0)); close(fd); } printf("POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC.\n"); if ((fd = open("/dev/ptmx", O_RDWR | O_NOCTTY | O_NONBLOCK | O_CLOEXEC)) < 0) printf("Failed to open /dev/ptmx.\n"); else { printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD)); printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0)); close(fd); } return 0; } $ ./open_test Regular file (read-only) with O_NONBLOCK | O_CLOEXEC. Descriptor flags: 1 Status flags: 4 POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC. Descriptor flags: 0 Status flags: 6 Regards, Dave
Re: CVS commit: src/lib/libc/stdlib
In article <20220928003547.d2375f...@cvs.netbsd.org>, David H. Gutteridge wrote: >-=-=-=-=-=- > >Module Name: src >Committed By: gutteridge >Date: Wed Sep 28 00:35:47 UTC 2022 > >Modified Files: > src/lib/libc/stdlib: posix_openpt.3 > >Log Message: >posix_openpt.3: reflect flag changes from r. 1.44 of tty_ptm.c > >Some flags are now accepted, others are still ignored. (E.g., other >BSDs would return EINVAL if O_RDWR wasn't passed, and we now accept >O_NONBLOCK but not O_CLOEXEC.) How so? #define FCNTLFLAGS (FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FDSYNC|FRSYNC|FALTIO|\ ^ FDIRECT|FNOSIGPIPE) /* bits to save after open(2) */ #define FMASK (FREAD|FWRITE|FCNTLFLAGS|FEXEC) ^^ Best, christos
Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)
> On Mar 26, 2022, at 9:39 AM, Taylor R Campbell > wrote: > > `C string' is ambiguous because there are also char arrays that > function as strings but which are not guaranteed to be NUL-terminated, > as strncpy is intended for. A non-terminated char array is not a C-string. The term C-string is not ambiguous. This is something that, amazingly, even Internet trolls appear to agree on. However, they do disagree as to the spelling of the terminating character's name, which is why I think it's best to elide it altogether. -- thorpej
Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)
> On Mar 26, 2022, at 9:09 AM, Warner Losh wrote: > > Since all the 'C' standards[*] use "null-terminated" and "null character", > it's likely best to use that terminology because there is a source of truth > for its definition in case of ambiguity or doubt. Ah, but you're giving up the opportunity to use indirection to solve the problem. By calling it a "C-string", then those who care what the standard calls the terminating character can go look it up! :-) -- thorpej
Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)
On Sat, Mar 26, 2022 at 9:53 AM Roland Illig wrote: > Am 24.03.2022 um 02:55 schrieb David H. Gutteridge: > > Module Name: src > > Committed By: gutteridge > > Date: Thu Mar 24 01:55:15 UTC 2022 > > > > Modified Files: > > src/lib/libc/gen: popen.3 > > > > Log Message: > > popen.3: minor spelling, grammar, style, and xref tweaks > > > > > > To generate a diff of this commit: > > cvs rdiff -u -r1.22 -r1.23 src/lib/libc/gen/popen.3 > > The term "null-terminated string" is quite common when talking about C. > In contrast, the word "nul" in "nul-terminated" always reminds me of > the character abbreviation in ASCII, which has a narrower scope than C. > I prefer to keep "null-terminated" here. > The standard uses "null-terminated" and "null character" (see Character Sets section 5.2.1 (from the C2x draft, but this term dates back to C89): "A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string." I couldn't find the definition for null-terminated though. This is different than the NULL #define Not to be confused with the all zeros ASCII charater, whose mnemonic is NUL, which is where some pressure to use NUL terminated comes from. I agree that it's usage is narrower and really only relevant for certain ASCII and ASCII-derived character sets, which is why the standard chose the spelling it did. Since all the 'C' standards[*] use "null-terminated" and "null character", it's likely best to use that terminology because there is a source of truth for its definition in case of ambiguity or doubt. Warner [*] I've not gone the extra mile and checked to see if K used this phrase, to be honest.
Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)
> Date: Sat, 26 Mar 2022 16:53:19 +0100 > From: Roland Illig > > The term "null-terminated string" is quite common when talking about C. > In contrast, the word "nul" in "nul-terminated" always reminds me of > the character abbreviation in ASCII, which has a narrower scope than C. > I prefer to keep "null-terminated" here. I feel like I've usually seen it as NUL-terminated. I thought it was in /usr/share/misc/style but I must have been thinking of a different style guide. `NUL' is better than `null' or `NULL' here because it's not a null pointer, unlike, e.g., the execve argv terminator. Even if the string isn't US-ASCII, what character encoding calls a nonzero byte `NUL'? `NUL' is better than `zero' or `0' here because it's unambiguously the all-bits-zero byte, not the US-ASCII encoding of `0' (i.e., decimal 48 or 0x30). `C string' is ambiguous because there are also char arrays that function as strings but which are not guaranteed to be NUL-terminated, as strncpy is intended for.
Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)
> On Mar 26, 2022, at 9:17 AM, Martin Husemann wrote: > When talking about it I prefer "zero terminated", or C-string, in > contrast to C++ std::string (which are objects) or Pascal strings > (which have an explicit length at the beginning). Yes, I also prefer the term “C-string" -- thorpej
Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)
On Sat, Mar 26, 2022 at 04:53:19PM +0100, Roland Illig wrote: > The term "null-terminated string" is quite common when talking about C. NULL terminated lists/array are quite common, but NULL is a pointer and the string is terminated by a 0 char (sometimes spelled as \0 in a string literal, but implicitly added by the compiler at the end of a literal, and spelled as NUL in the ascii table). > I prefer to keep "null-terminated" here. I think it is a bug. When talking about it I prefer "zero terminated", or C-string, in contrast to C++ std::string (which are objects) or Pascal strings (which have an explicit length at the beginning). Martin
null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)
Am 24.03.2022 um 02:55 schrieb David H. Gutteridge: Module Name:src Committed By: gutteridge Date: Thu Mar 24 01:55:15 UTC 2022 Modified Files: src/lib/libc/gen: popen.3 Log Message: popen.3: minor spelling, grammar, style, and xref tweaks To generate a diff of this commit: cvs rdiff -u -r1.22 -r1.23 src/lib/libc/gen/popen.3 The term "null-terminated string" is quite common when talking about C. In contrast, the word "nul" in "nul-terminated" always reminds me of the character abbreviation in ASCII, which has a narrower scope than C. I prefer to keep "null-terminated" here. Roland
Re: CVS commit: src/lib/libc/time
In article <977b81a4-d330-6722-7ce4-cc4e61011...@gmx.de>, Roland Illig wrote: >Am 25.03.2022 um 22:25 schrieb Christos Zoulas: >> In article <20220325183551.0f039f...@cvs.netbsd.org>, >> Roland Illig wrote: >>> -=-=-=-=-=- >>> >>> Module Name:src >>> Committed By: rillig >>> Date: Fri Mar 25 18:35:50 UTC 2022 >>> >>> Modified Files: >>> src/lib/libc/time: localtime.c >>> >>> Log Message: >>> localtime.c: add back storage class 'register' >>> >>> This reduces the differences to the upstream code. >> >> I don't know why you did that. It is as simple to sed -e 's/register //g' >> in upstream when you diff. Adding register is useless; these days it is >> mostly ignored by compilers, except you can't &... > >I thought that having a small diff to the upstream code was more >important than a few saved keywords. If I was wrong, I can of course >revert it. I don't care too much, less clutter without obsolete keywords that do nothing or worse. christos
Re: CVS commit: src/lib/libc/time
Am 26.03.2022 um 00:50 schrieb Tobias Nygren: On Sat, 26 Mar 2022 00:31:45 +0100 Roland Illig wrote: localtime.c: add back storage class 'register' This reduces the differences to the upstream code. I don't know why you did that. It is as simple to sed -e 's/register //g' in upstream when you diff. Adding register is useless; these days it is mostly ignored by compilers, except you can't &... I thought that having a small diff to the upstream code was more important than a few saved keywords. If I was wrong, I can of course revert it. Can we keep it as-is, but #define register to empty at the top of the file? I don't think the compiler's view on the code is the main point of this discussion. At least, GCC generates the same code on x86_64, with or without 'register', so it doesn't seem to care. To me, the more interesting question is how human readers should deal with the code. If this piece of code had originated in the NetBSD tree, I wouldn't have added the 'register' keywords, I would have just indented the code according to KNF. But, most of this code is copied from the tz project, therefore our goal should be to make the diff minimal, in order to benefit from the testing of other projects using the same code. In this case, it means to keep the indentation exactly as inconsistent as upstream and also to keep the keyword 'register'. The remaining diff is still large enough to consume some developer time, containing a few fixes for WARNS=6 and for the lint checks. Roland
Re: CVS commit: src/lib/libc/time
On Sat, 26 Mar 2022 00:31:45 +0100 Roland Illig wrote: > >> localtime.c: add back storage class 'register' > >> > >> This reduces the differences to the upstream code. > > > > I don't know why you did that. It is as simple to sed -e 's/register //g' > > in upstream when you diff. Adding register is useless; these days it is > > mostly ignored by compilers, except you can't &... > > I thought that having a small diff to the upstream code was more > important than a few saved keywords. If I was wrong, I can of course > revert it. Can we keep it as-is, but #define register to empty at the top of the file?
Re: CVS commit: src/lib/libc/time
Am 25.03.2022 um 22:25 schrieb Christos Zoulas: In article <20220325183551.0f039f...@cvs.netbsd.org>, Roland Illig wrote: -=-=-=-=-=- Module Name:src Committed By: rillig Date: Fri Mar 25 18:35:50 UTC 2022 Modified Files: src/lib/libc/time: localtime.c Log Message: localtime.c: add back storage class 'register' This reduces the differences to the upstream code. I don't know why you did that. It is as simple to sed -e 's/register //g' in upstream when you diff. Adding register is useless; these days it is mostly ignored by compilers, except you can't &... I thought that having a small diff to the upstream code was more important than a few saved keywords. If I was wrong, I can of course revert it. Roland
Re: CVS commit: src/lib/libc/time
In article , Roland Illig wrote: > >The mess with the indentation comes from upstream. In our copy it's a >bit worth than upstream, but not much. I've asked upstream to indent consistently in the past. It has not happened. I try to minimize the diffs with upstream so that we can keep patching in our changes. I've also submitted our changes for inclusion to upstream. Some have been accepted, others have not. Best, christos
Re: CVS commit: src/lib/libc/time
In article <20220325183551.0f039f...@cvs.netbsd.org>, Roland Illig wrote: >-=-=-=-=-=- > >Module Name: src >Committed By: rillig >Date: Fri Mar 25 18:35:50 UTC 2022 > >Modified Files: > src/lib/libc/time: localtime.c > >Log Message: >localtime.c: add back storage class 'register' > >This reduces the differences to the upstream code. I don't know why you did that. It is as simple to sed -e 's/register //g' in upstream when you diff. Adding register is useless; these days it is mostly ignored by compilers, except you can't &... christos
Re: CVS commit: src/lib/libc/time
In article <20220325190016.15114f...@cvs.netbsd.org>, Roland Illig wrote: >-=-=-=-=-=- > >Module Name: src >Committed By: rillig >Date: Fri Mar 25 19:00:16 UTC 2022 > >Modified Files: > src/lib/libc/time: localtime.c > >Log Message: >localtime.c: take indentation style from upstream That I am fine with, but remember that some of the indentation changes in upstream were gratuitus (no code changes). christos
Re: CVS commit: src/lib/libc/time
Am 25.03.2022 um 13:59 schrieb Robert Elz: Date:Thu, 24 Mar 2022 23:32:30 +0100 From:Roland Illig Message-ID: <6bb00924-edaf-a4c8-348e-ba1304d57...@gmx.de> | Someone should clean up this mess. No, they probabky shouldn't, in general. That source comes from the tz project (currently from tzcode2022a) with local modifications. When I said "somebody" above, I really meant "somebody at the appropriate position". I already suspected that this would not be NetBSD. Sorry for the brevity. I think upstream mostly uses ts=4 (or more likely the emacs equivalent). we mostky use tab, and view with ts=8 and yes, that can mead to ugly results. The mess with the indentation comes from upstream. In our copy it's a bit worth than upstream, but not much. Roland
Re: CVS commit: src/lib/libc/time
Date:Thu, 24 Mar 2022 23:32:30 +0100 From:Roland Illig Message-ID: <6bb00924-edaf-a4c8-348e-ba1304d57...@gmx.de> | Someone should clean up this mess. No, they probabky shouldn't, in general. That source comes from the tz project (currently from tzcode2022a) with local modifications. I think upstream mostly uses ts=4 (or more likely the emacs equivalent). we mostky use tab, and view with ts=8 and yes, that can mead to ugly results. But to ease merging of future versions, this should be touched as little as possible. kre ps: one way and another we have plenty of localtime testing, not all in tests/libc. At least for general correctness. I don't know if all the various hard casez are as well tested locally, but changes that would affect thise generally come from upstream. Lots of people do lots of tests on the upstream code.
Re: CVS commit: src/lib/libc/time
Am 24.03.2022 um 17:15 schrieb Christos Zoulas: Module Name:src Committed By: christos Date: Thu Mar 24 16:15:05 UTC 2022 Modified Files: src/lib/libc/time: localtime.c Log Message: put back the 2022a changes and fix the misplaced brace. To generate a diff of this commit: cvs rdiff -u -r1.128 -r1.129 src/lib/libc/time/localtime.c The indentation of this file looks completely random: Line 272 has a comment indented by 2 spaces, the code in that block is indented by 4 spaces instead of the usual tab. Same for line 295. Line 350 is indented differently than line 346. Line 420 has a continuation that is indented by 2 spaces instead of the usual 4. Line 490 is indented by 2 spaces instead of the usual tab. Line 513 is indented by 4 spaces instead of the usual tab. Line 534 is indented by 2 spaces instead of the usual tab. Line 585 is indented by 2 spaces instead of the usual tab. Line 667 is indented by 2 tabs instead of the usual 1 tab. Line 683 is indented by 2 tabs instead of the usual tab. Line 688 is indented by 2 spaces instead of the usual tab. Line 1083 contains unaligned variable declarations. Line 2347 looks as if it were assuming a tab width of 4, but the immediately following line breaks this assumption. Someone should clean up this mess. I don't know how good the test coverage for localtime(3) is, so I'm happy to leave this (and a few possible refactorings to decrease the indentation) to people more familiar with the history and origins of this file. Roland
Re: CVS commit: src/lib/libc/time
On Wed, Mar 23, 2022 at 02:00:29PM +0100, Tobias Nygren wrote: > I think this commit broke something. > date(1) no longer reports a time zone and is stuck at GMT time: > > $ ls -l /etc/localtime > lrwxr-xr-x 1 root wheel 36 Mar 22 08:47 /etc/localtime -> > /usr/share/zoneinfo/Europe/Stockholm > $ date > Wed Mar 23 12:57:40 2022 > $ TZ=CET date > Wed Mar 23 12:57:44 2022 > $ TZ=foo date > Wed Mar 23 12:57:48 GMT 2022 The lib/libc/time tests are failing and e.g. this: zdump -v /usr/share/zoneinfo/US/Eastern gives a suprisingly short output. Martin
Re: CVS commit: src/lib/libc/time
On Tue, 22 Mar 2022 13:48:39 -0400 Christos Zoulas wrote: > Modified Files: > src/lib/libc/time: CONTRIBUTING Makefile NEWS localtime.c private.h > theory.html tz-link.html tzselect.8 tzselect.ksh version zdump.c > zic.c > > Log Message: > welcome to tzcode-2022a Hi, I think this commit broke something. date(1) no longer reports a time zone and is stuck at GMT time: $ ls -l /etc/localtime lrwxr-xr-x 1 root wheel 36 Mar 22 08:47 /etc/localtime -> /usr/share/zoneinfo/Europe/Stockholm $ date Wed Mar 23 12:57:40 2022 $ TZ=CET date Wed Mar 23 12:57:44 2022 $ TZ=foo date Wed Mar 23 12:57:48 GMT 2022 Kind regards, -Tobias
Re: CVS commit: src/lib/libc/stdlib
On Sat, 12 Mar 2022 08:26:01 + Nia Alarie wrote: > Module Name: src > Committed By: nia > Date: Sat Mar 12 08:26:01 UTC 2022 > > Modified Files: > src/lib/libc/stdlib: hcreate.c > > Log Message: > hcreate(3): use reallocarr instead of malloc(x * y) Caution: malloc(0) and reallocarr(, 0, s) have different semantics for the returned pointer value.
Re: CVS commit: src/lib/libc/stdlib
On Sat, Feb 12, 2022, 11:52 AM Taylor R Campbell < campbell+netbsd-source-change...@mumble.net> wrote: > > Date: Sun, 13 Feb 2022 05:44:38 +1100 > > from: matthew green > > > > "Roland Illig" writes: > > > Module Name:src > > > Committed By: rillig > > > Date: Fri Feb 11 21:36:46 UTC 2022 > > > > > > Modified Files: > > > src/lib/libc/stdlib: getenv.c > > > > > > Log Message: > > > libc/getenv: remove trailing whitespace > > > > > > No binary change. > > > > please avoid purely whitespace changes unless you're > > also going to be modifying the code itself. > > > > (don't back out now.) > > I thought we were supposed to _avoid_ mixing whitespace changes and > functional changes? > > (Obviously the solution is to just only ever commit code with perfect > style to begin with so this is never an issue! Speaking of which, > (setq show-trailing-whitespace t) is helpful to avoid this kind of > mistake up front. git diff or git add -p will also flag this kind of > mistake, if you're working in git. /usr/share/misc/NetBSD.el is also > helpful for other KNF style.) > I thought the rule was don't make purely whitespace changes UNLESS you plan on making other changes to (or are fixing an oops you just made). If that's the case, separate the two commits. Warner >
Re: CVS commit: src/lib/libc/stdlib
Am 12.02.2022 um 20:10 schrieb Warner Losh: I thought the rule was don't make purely whitespace changes UNLESS you plan on making other changes to (or are fixing an oops you just made). If that's the case, separate the two commits. That sounds like a useful rule, I'll stick to it. Roland
Re: CVS commit: src/lib/libc/stdlib
> Date: Sun, 13 Feb 2022 05:44:38 +1100 > from: matthew green > > "Roland Illig" writes: > > Module Name:src > > Committed By: rillig > > Date: Fri Feb 11 21:36:46 UTC 2022 > > > > Modified Files: > > src/lib/libc/stdlib: getenv.c > > > > Log Message: > > libc/getenv: remove trailing whitespace > > > > No binary change. > > please avoid purely whitespace changes unless you're > also going to be modifying the code itself. > > (don't back out now.) I thought we were supposed to _avoid_ mixing whitespace changes and functional changes? (Obviously the solution is to just only ever commit code with perfect style to begin with so this is never an issue! Speaking of which, (setq show-trailing-whitespace t) is helpful to avoid this kind of mistake up front. git diff or git add -p will also flag this kind of mistake, if you're working in git. /usr/share/misc/NetBSD.el is also helpful for other KNF style.)
re: CVS commit: src/lib/libc/stdlib
"Roland Illig" writes: > Module Name: src > Committed By: rillig > Date: Fri Feb 11 21:36:46 UTC 2022 > > Modified Files: > src/lib/libc/stdlib: getenv.c > > Log Message: > libc/getenv: remove trailing whitespace > > No binary change. please avoid purely whitespace changes unless you're also going to be modifying the code itself. (don't back out now.) thanks. .mrg.
Re: CVS commit: src/lib/libc/string
On Fri, Oct 29, 2021 at 10:11:57AM +, Nia Alarie wrote: > Module Name: src > Committed By: nia > Date: Fri Oct 29 10:11:57 UTC 2021 > > Modified Files: > src/lib/libc/string: wcsdup.c > > Log Message: > wcsdup(3): use reallocarr to catch integer overflow Except that no such integer overflow can happen, since the input string already has done the same computation. Joerg
Re: CVS commit: src/lib/libc/arch/powerpc/string
On 2021/07/25 6:32, Joerg Sonnenberger wrote: On Sat, Jul 24, 2021 at 05:27:26AM +, Rin Okuyama wrote: Module Name:src Committed By: rin Date: Sat Jul 24 05:27:26 UTC 2021 Modified Files: src/lib/libc/arch/powerpc/string: Makefile.inc Log Message: For evbppc, use C version of bcopy(3), memcpy(3), memcmp(3), and memmove(3) consistently for debug library (*.go) in order to avoid alignment faults for 403. Why do we want to pessimize all evbppc targets just for issues with the 403 design? Well, for kernel, we can readily switch unaffected machines to assembler versions. For userland, we can provide something like libc_ua.so for machines with capability of unaligned access. Some time, I will benchmark whether it is worth the cost. Thanks, rin
Re: CVS commit: src/lib/libc/arch/powerpc/string
On Sat, Jul 24, 2021 at 05:27:26AM +, Rin Okuyama wrote: > Module Name: src > Committed By: rin > Date: Sat Jul 24 05:27:26 UTC 2021 > > Modified Files: > src/lib/libc/arch/powerpc/string: Makefile.inc > > Log Message: > For evbppc, use C version of bcopy(3), memcpy(3), memcmp(3), and > memmove(3) consistently for debug library (*.go) in order to avoid > alignment faults for 403. Why do we want to pessimize all evbppc targets just for issues with the 403 design? Joerg
Re: CVS commit: src/lib/libc/arch/arm
Oops, this is broken. I will fix shortly... rin On 2021/06/30 8:29, Rin Okuyama wrote: Module Name:src Committed By: rin Date: Tue Jun 29 23:29:12 UTC 2021 Modified Files: src/lib/libc/arch/arm/gen: swapcontext.S src/lib/libc/arch/arm/sys: __clone.S Log Message: Align sp to 8-byte boundary as required by EABI. IIUC, this change only affects libc compiled for ``Thumb-mode userland'', which we've not officially supported yet. To generate a diff of this commit: cvs rdiff -u -r1.15 -r1.16 src/lib/libc/arch/arm/gen/swapcontext.S cvs rdiff -u -r1.9 -r1.10 src/lib/libc/arch/arm/sys/__clone.S Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files.
Re: vfork() and posix_spawn() [was: Re: CVS commit: src/lib/libc/sys]
Date:Mon, 14 Jun 2021 03:56:48 +0200 From:Joerg Sonnenberger Message-ID: | This is even more true for multi-threaded applications | (where POSIX explicitly suggests that requirement). Sure, anything with fork() and threads has issues, that's messy. Even I know that, and I know very little about threads. | On the specific topic, I'm somewhat puzzled by the claim that fork is | async-signal-safe That's not what I said, I said it isn't (though the way I phrased that might not have been all that clear - I know I often type too much, and sometimes overcompensate by typing too little). But (from my message): If the "that said" relates to fork() or vfork() not being async signal safe, so a double fork() (when the first is vfork anyway) would not be condoned, was meant to acknowledge that fork() (and vfork()) aren't async signal safe, and so if a requirement that async signal safe functions are all that is permitted after vfork() actually existed, then a if (vfork() == 0) fork(); sequence would not be permitted (not generate defined actions). But for the purposes of whether double fork() option to posix_spawn is useful or not, this restriction doesn't matter, as the above can be (eventually) written if (vfork() == 0) _Fork(); instead, as _Fork() is (to be) async signal safe. Same effect as the (perhaps) undefined version above for this purpose. | since most implementations will require synchronisation for pthread_atfork Yes, which is what _Fork() does not do (_Fork() is to fork() as _exit() is to exit() - in a sense). | It most likely should. The main reason is that much of the system can | and often do depend on things like mutexes to ensure correctness and all | that is essentially UB after vfork(). Actually, it isn't UB, or shouldn't be. The behaviour of vfork() is actually very precisely specified (which doesn't take much, as it is quite simple). There's very little room for UB. What there is is plenty of room for simple errors to screw things (and appear as if it is UB, when it is actually quite well defined broken code). There's no problem using mutexes after a vfork() with two caveats - first anything locked must be unlocked again before the eventual _exit() or exec() (the mutex must be returned to its state at the time of the vfork()) and the child must not need to lock anything which might be already locked in the parent (as that's a guaranteed deadlock). How one makes the latter condition work is for someone else to solve... That said, most of what would require a mutex isn't something that should be being done after a vfork() anyway - most of those operations are likely to be things which change state of the process, and that's what (at least without careful preparation) cannot happen in a child of a vfork() (so no use of stdio, no use of malloc, ...) | That's even ignoring the stronger issue of mutating state. Yes, though if it is intentional, that can actually be OK (sometimes). Our /bin/sh relies upon the ability to modify its parent's state after a vfork() to communicate status back to the parent (more than is possible reliably with exit status - as the parent cannot tell from that whether the status it receives is from its own image running the child after the vfork() or from some other process after an exec() succeeded). A modified volatile variable (in the parent), however, can only have been modified by the vfork() child (we assume here that the parent isn't modifying the variable itself elsewhere, naturally). And yes, that means that on any system that "implements" vfork() using cc -Dvfork=fork (or its equivalent) cannot use vfork() with our sh - we need a "real" vfork, or must simply disable its use and only use fork (which is what the -D implementation does anyway of course, just badly). | vfork use really should die... That sentiment has been around for a long time - almost since vfork() was first created (40+ years ago). But it is still here. Implementing fork() using CoW was supposed to solve all the issues with fork(). It didn't, vfork() is still lots faster. posix_spawn() might allow some uses of vfork()+exec() to be retired, that would be good, but it isn't going to get all of them. | No, it relates to one common pattern for used by or for daemon. Yes, I understood that, but why do we care? Daemons start how often? What percentage of the forks() (vfork is, in practice, never used for this) on your systems are generated by daemons starting? What kind of saving would you expect to see from allowing posix_spawn() to replace that fork();fork() sequence ? Is it really worth even the tiniest extra complexity in that already fairly complex interface to handle this almost irrelevant issue? Do you believe in non-memory managed real time systems there are any daemon processes like that at all? | There are non-trivial uses of fork, yes. There are much less
Re: vfork() and posix_spawn() [was: Re: CVS commit: src/lib/libc/sys]
On Mon, Jun 14, 2021 at 01:33:51AM +0700, Robert Elz wrote: > Date:Sat, 12 Jun 2021 23:13:54 +0200 > From:Joerg Sonnenberger > Message-ID: > > Sorry, missed this message when I was cherry picking messages to read in > a timely fashion. > > | On Wed, Jun 09, 2021 at 01:03:20AM +0700, Robert Elz wrote: > | > after a vfork() the child can do anything > > | This is not true. After vfork(), it is only supposed to call async > | signal safe functions. > > What you said, and what I said are not contradictory. Note I said "can > do" and you said "is only supposed to". > > What is supposed to work and what actually does work (or can be made to > work) are two different things. Sure, but the set of what works reliably has actually been shrinking over time and more bugs in various programs using vfork specifically have been discovered. This is even more true for multi-threaded applications (where POSIX explicitly suggests that requirement). On the specific topic, I'm somewhat puzzled by the claim that fork is async-signal-safe since most implementations will require synchronisation for pthread_atfork and that seems to place quite a burden on the implementation... > I would note though our man page doesn't make that requirement, or not > that I can see anyway, and vfork() is not in POSIX of course - and while > I don't have a copy to check, I'd suspect not in the C standards either), > so while that sounds like a reasonable requirement to impose for safer > operation, I'm not sure that anyone or anything actually does that. It most likely should. The main reason is that much of the system can and often do depend on things like mutexes to ensure correctness and all that is essentially UB after vfork(). That's even ignoring the stronger issue of mutating state. vfork use really should die... > > | That said, a flag for the double fork semantic might be useful. > > If the "that said" relates to fork() or vfork() not being async signal safe, > so a double fork() (when the first is vfork anyway) would not be condoned, > that doesn't matter, there to be an async-signal-safe _Fork() call added to > the next version of POSIX, so even with the "only async signal safe" > requirement for vfork() a: No, it relates to one common pattern for used by or for daemon. > Please don't see posix_spawn() as being (or ever becoming) a panacea that > can replace all fork() (or even just vfork()) calls - even just in the > cases where an exec() follows soon after. It works fine for most simple > cases, but there are lots of not-so-simple situations that it cannot handle, > and burdening the interface with the ability to deal with all of those > would reduce the "efficiently implementable" goal for little real benefit. There are non-trivial uses of fork, yes. There are much less non-trivial uses of vfork as the latter already has quite a few problems with spooky actions otherwise. Supporting something like a double fork flag has very little impact on the complexity of the implementation and even less impact on the efficiency. We certainly are at the point where we can start analyzing the remaining blockers for (v)fork+exec users. > Another example is made obvious by the bug Martin committed a fix for > in the past few days ... the order of operations was incorrect in our > implementation. But that this problem can exist shows that there > are ordering issues - a process that wants to open files using its > current privs, then reset those before (perhaps) opening more files, > and then doing the exec() part cannot do that with posix_spawn(). > Again, this is rare enough that adding the complexity required to allow > it to work just isn't worth it. [In this area also note that the > POSIX_SPAWN_RESETIDS flag causes (effectively) setuid(getuid()), there's > no similar operation to do setuid(geteuid()) ... again too rare to bother.] I quite disagree here, actually. The design-level issue is that POSIX_SPAWN_RESETIDS is a flag and not an action. This means it can't be sequenced and that is the reason for the limitation. There is an obvious parallel with the semantics of the chdir action here--that needs to be that, an action and not just a flag. The separate concern is of course that we need more testing for posix_spawn, but that is hopefully also going to become better as side effect of the non-GSoC project. Joerg
vfork() and posix_spawn() [was: Re: CVS commit: src/lib/libc/sys]
Date:Sat, 12 Jun 2021 23:13:54 +0200 From:Joerg Sonnenberger Message-ID: Sorry, missed this message when I was cherry picking messages to read in a timely fashion. | On Wed, Jun 09, 2021 at 01:03:20AM +0700, Robert Elz wrote: | > after a vfork() the child can do anything | This is not true. After vfork(), it is only supposed to call async | signal safe functions. What you said, and what I said are not contradictory. Note I said "can do" and you said "is only supposed to". What is supposed to work and what actually does work (or can be made to work) are two different things. I would note though our man page doesn't make that requirement, or not that I can see anyway, and vfork() is not in POSIX of course - and while I don't have a copy to check, I'd suspect not in the C standards either), so while that sounds like a reasonable requirement to impose for safer operation, I'm not sure that anyone or anything actually does that. | That said, a flag for the double fork semantic might be useful. If the "that said" relates to fork() or vfork() not being async signal safe, so a double fork() (when the first is vfork anyway) would not be condoned, that doesn't matter, there to be an async-signal-safe _Fork() call added to the next version of POSIX, so even with the "only async signal safe" requirement for vfork() a: if ((child = vfork()) == 0) { if (_Fork())_exit(0); /* do child stuff here */ } else { if (child != -1) waitpid(child, , 0); /* parent continues */ } sequence is possible (will be possible, we don't have _Fork() yet) and fully conforming, apart from vfork() not being in the standard. The child of the vfork() only does _Fork() and _exit(), both of which are permitted. The grandchild process is created by a full fork() so can do anything at all. For what it is worth, the definition of _Fork() is (to be): The _Fork() function shall be equivalent to fork(), except that fork handlers established by means of the pthread_atfork() function shall not be called and _Fork() shall be async-signal-safe. Aside from the prototype line, which is just as you'd expect, that's it. If (or when) we add it, we probably need to decide whether we need _VFork() (or some similar name) as well. Probably. However, an extra posix_spawn attribute to handle double fork is almost certainly not warranted, this kind of thing isn't all that common, and adding the mechanism to the posix_spawn set of functions would really just be bloat (there's also a proposal floating around to add the ability to change a terminal's process group as a posix_spawn action -- that's required for shells to be able to use posix_spawn in general cases, but that most likely won't go anywhere). Remember what the standard says about posix_spawn() (in its rationale, this is not strictly part of the standard)... The posix_spawn( ) function and its close relation posix_spawnp( ) have been introduced to overcome the following perceived difficulties with fork( ): the fork( ) function is difficult or impossible to implement without swapping or dynamic address translation. [...] This view of the role of posix_spawn( ) and posix_spawnp( ) influenced the design of their API. It does not attempt to provide the full functionality of fork( )/exec in which arbitrary user-specified operations of any sort are permitted between the creation of the child process and the execution of the new process image; any attempt to reach that level would need to provide a programming language as parameters. [...] [ It lists some of the known problems with the posix_spawn interface ] [...] The posix_spawn( ) and posix_spawnp( ) functions do not have all the power of fork( )/exec. This is to be expected. The fork( ) function is a wonderfully powerful operation. We do not expect to duplicate its functionality in a simple, fast function with no special hardware requirements. It is worth noting that posix_spawn( ) and posix_spawnp( ) are very similar to the process creation operations on many operating systems that are not UNIX systems. It then goes on to list the requirements they had for posix_spawn()... The requirements for posix_spawn( ) and posix_spawnp( ) are: � They must be implementable without an MMU or unusual hardware. � They must be compatible with existing POSIX standards. Additional goals are: � They should be efficiently implementable. � They should be able to replace at least 50% of typical executions of fork( ). � A system with posix_spawn( ) and posix_spawnp( ) and without fork( ) should be useful, at least for realtime applications. � A
Re: CVS commit: src/lib/libc/sys
On Wed, Jun 09, 2021 at 01:03:20AM +0700, Robert Elz wrote: > It should, when it is workable for the application, make things > faster, while also avoiding the vfork() perils. But only when > it works -- after a vfork() the child can do anything (it should > avoid touching its parent's state, but it can) posix_spawn() isn't > nearly that general, it just handles the most common things apps > are likely to want to do between the fork() and exec(). This is not true. After vfork(), it is only supposed to call async signal safe functions. That said, a flag for the double fork semantic might be useful. Joerg
Re: CVS commit: src/lib/libc/sys
Date:Tue, 8 Jun 2021 16:15:12 + From:"Nia Alarie" Message-ID: <20210608161512.1d7c3f...@cvs.netbsd.org> | vfork.2: recommend posix_spawn instead You might want to reconsider the wording there, posix_spawn() is an alternative to [v]fork() + exec*(). Not just vfork(). It should, when it is workable for the application, make things faster, while also avoiding the vfork() perils. But only when it works -- after a vfork() the child can do anything (it should avoid touching its parent's state, but it can) posix_spawn() isn't nearly that general, it just handles the most common things apps are likely to want to do between the fork() and exec(). Perhaps most notably, it has no notion of (with err checking omitted): if (fork()) { parent_code(/* do a waitpid() first */); } else if (fork()) { exit(0); } else { child_code() /* often ending with execX() */ } when one wants to create an orphan (nothing that cannot be waited upon). kre
Re: CVS commit: src/lib/libc/regex
In article <5c9e716-7ec1-9c7d-a7cb-21f08946...@invisible.ca>, Jared McNeill wrote: >Building tools on macOS: > >/Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8: > >error: implicit declaration of function 'reallocarray' is invalid > in C99 [-Werror,-Wimplicit-function-declaration] > ncs = reallocarray(p->g->sets, p->g->ncsets + 1, sizeof(*ncs)); > ^ >/Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8: > >note: did you mean 'reallocarr'? >/Users/jmcneill/netbsd/git-src/tools/compat/compat_defs.h:556:5: note: >'reallocarr' declared here >int reallocarr(void *, size_t, size_t); > ^ Fixed, thanks! christos
Re: CVS commit: src/lib/libc/regex
Building tools on macOS: /Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8: error: implicit declaration of function 'reallocarray' is invalid in C99 [-Werror,-Wimplicit-function-declaration] ncs = reallocarray(p->g->sets, p->g->ncsets + 1, sizeof(*ncs)); ^ /Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8: note: did you mean 'reallocarr'? /Users/jmcneill/netbsd/git-src/tools/compat/compat_defs.h:556:5: note: 'reallocarr' declared here int reallocarr(void *, size_t, size_t); ^ On Tue, 23 Feb 2021, Christos Zoulas wrote: Module Name:src Committed By: christos Date: Tue Feb 23 22:14:59 UTC 2021 Modified Files: src/lib/libc/regex: cname.h engine.c re_format.7 regcomp.c regerror.c regex.3 regex2.h regexec.c regfree.c utils.h Removed Files: src/lib/libc/regex: cclass.h Log Message: sync with FreeBSD: - NLS support - GNU extensions - bug fixes To generate a diff of this commit: cvs rdiff -u -r1.7 -r0 src/lib/libc/regex/cclass.h cvs rdiff -u -r1.7 -r1.8 src/lib/libc/regex/cname.h cvs rdiff -u -r1.24 -r1.25 src/lib/libc/regex/engine.c cvs rdiff -u -r1.12 -r1.13 src/lib/libc/regex/re_format.7 cvs rdiff -u -r1.38 -r1.39 src/lib/libc/regex/regcomp.c cvs rdiff -u -r1.23 -r1.24 src/lib/libc/regex/regerror.c cvs rdiff -u -r1.26 -r1.27 src/lib/libc/regex/regex.3 cvs rdiff -u -r1.13 -r1.14 src/lib/libc/regex/regex2.h cvs rdiff -u -r1.22 -r1.23 src/lib/libc/regex/regexec.c cvs rdiff -u -r1.15 -r1.16 src/lib/libc/regex/regfree.c cvs rdiff -u -r1.6 -r1.7 src/lib/libc/regex/utils.h Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files.
Re: CVS commit: src/lib/libc/gen
On Wed, Feb 17, 2021 at 11:51:04PM +, David A. Holland wrote: > Also, Someone(TM) should check if POSIX permits this or if we ought to > improve the implementation. Unsurprisingly, POSIX is silent. It just says "rewinddir shall also cause the directory stream to refer to the current state of the corresponding directory, as a call to opendir() would have done," basically the same text we had previously. This could be read as _mandating_ reopening it, but that is almost certainly not what they intended, so I think we're ok. https://pubs.opengroup.org/onlinepubs/9699919799/functions/opendir.html https://pubs.opengroup.org/onlinepubs/9699919799/functions/rewinddir.html -- David A. Holland dholl...@netbsd.org
Re: CVS commit: src/lib/libc/locale
On Mon, Feb 15, 2021 at 04:37:15PM +0100, Thomas Klausner wrote: > Hi! > > Thanks for the man pages. > > There is none for uselocale(3), which is heavily referenced from > these, nor do I find a prototype in /usr/include... > so how does one use these? :) We don't support uselocale. You use this functions by passing the locale_t to the *_l functions directly. Joerg
Re: CVS commit: src/lib/libc/locale
Hi! Thanks for the man pages. There is none for uselocale(3), which is heavily referenced from these, nor do I find a prototype in /usr/include... so how does one use these? :) Thomas On Mon, Feb 15, 2021 at 09:35:04AM -0500, Christos Zoulas wrote: > Module Name: src > Committed By: christos > Date: Mon Feb 15 14:35:04 UTC 2021 > > Modified Files: > src/lib/libc/locale: Makefile.inc > Added Files: > src/lib/libc/locale: duplocale.3 freelocale.3 newlocale.3 > > Log Message: > Add missing man pages (from FreeBSD) > > > To generate a diff of this commit: > cvs rdiff -u -r1.64 -r1.65 src/lib/libc/locale/Makefile.inc > cvs rdiff -u -r0 -r1.1 src/lib/libc/locale/duplocale.3 \ > src/lib/libc/locale/freelocale.3 src/lib/libc/locale/newlocale.3 > > Please note that diffs are not public domain; they are subject to the > copyright notices on the relevant files. > > Modified files: > > Index: src/lib/libc/locale/Makefile.inc > diff -u src/lib/libc/locale/Makefile.inc:1.64 > src/lib/libc/locale/Makefile.inc:1.65 > --- src/lib/libc/locale/Makefile.inc:1.64 Sun Aug 18 16:03:48 2013 > +++ src/lib/libc/locale/Makefile.inc Mon Feb 15 09:35:04 2021 > @@ -1,5 +1,5 @@ > #from: @(#)Makefile.inc 5.1 (Berkeley) 2/18/91 > -#$NetBSD: Makefile.inc,v 1.64 2013/08/18 20:03:48 joerg Exp $ > +#$NetBSD: Makefile.inc,v 1.65 2021/02/15 14:35:04 christos Exp $ > > # locale sources > .PATH: ${ARCHDIR}/locale ${.CURDIR}/locale > @@ -21,7 +21,7 @@ CPPFLAGS.runetable.c+= -I${LIBCDIR}/cit > CPPFLAGS.multibyte_c90.c+= -I${LIBCDIR}/citrus > CPPFLAGS.multibyte_amd1.c+= -I${LIBCDIR}/citrus > > -MAN+=setlocale.3 nl_langinfo.3 > +MAN+=duplocale.3 freelocale.3 newlocale.3 setlocale.3 nl_langinfo.3 > > MAN+=mbtowc.3 mbstowcs.3 wctomb.3 wcstombs.3 mblen.3 \ > > > Added files: > > Index: src/lib/libc/locale/duplocale.3 > diff -u /dev/null src/lib/libc/locale/duplocale.3:1.1 > --- /dev/null Mon Feb 15 09:35:04 2021 > +++ src/lib/libc/locale/duplocale.3 Mon Feb 15 09:35:04 2021 > @@ -0,0 +1,80 @@ > +.\" $NetBSD: duplocale.3,v 1.1 2021/02/15 14:35:04 christos Exp $ > +.\" Copyright (c) 2011 The FreeBSD Foundation > +.\" All rights reserved. > +.\" > +.\" This documentation was written by David Chisnall under sponsorship from > +.\" the FreeBSD Foundation. > +.\" > +.\" Redistribution and use in source and binary forms, with or without > +.\" modification, are permitted provided that the following conditions > +.\" are met: > +.\" 1. Redistributions of source code must retain the above copyright > +.\"notice, this list of conditions and the following disclaimer. > +.\" 2. Redistributions in binary form must reproduce the above copyright > +.\"notice, this list of conditions and the following disclaimer in the > +.\"documentation and/or other materials provided with the distribution. > +.\" > +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND > +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE > +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR > PURPOSE > +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE > +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR > CONSEQUENTIAL > +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS > +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) > +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, > STRICT > +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY > +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > +.\" SUCH DAMAGE. > +.\" > +.\" $FreeBSD: head/lib/libc/locale/duplocale.3 281925 2015-04-24 10:17:55Z > theraven $ > +.\" > +.Dd September 17, 2011 > +.Dt DUPLOCALE 3 > +.Os > +.Sh NAME > +.Nm duplocale > +.Nd duplicate an locale > +.Sh LIBRARY > +.Lb libc > +.Sh SYNOPSIS > +.In locale.h > +.Ft locale_t > +.Fn duplocale "locale_t locale" > +.Sh DESCRIPTION > +Duplicates an existing > +.Fa locale_t > +returning a new > +.Fa locale_t > +that refers to the same locale values but has an independent internal state. > +Various functions, such as > +.Xr mblen 3 > +require a persistent state. > +These functions formerly used static variables and calls to them from > multiple > +threads had undefined behavior. > +They now use fields in the > +.Fa locale_t > +associated with the current thread by > +.Xr uselocale 3 . > +These calls are therefore only thread safe on threads with a unique > per-thread > +locale. > +The locale returned by this call must be freed with > +.Xr freelocale 3 . > +.Sh SEE ALSO > +.Xr freelocale 3 , > +.Xr localeconv 3 , > +.Xr newlocale 3 , > +.\" .Xr querylocale 3 , > +.Xr uselocale 3 , > +.\" .Xr xlocale 3 > +.Sh STANDARDS > +This function conforms to > +.St -p1003.1-2008 . > +.Sh BUGS > +Ideally, > +.Xr uselocale 3 > +should make a
Re: CVS commit: src/lib/libc/stdio
Date:Mon, 1 Feb 2021 17:50:54 + From:"Jaromir Dolecek" Message-ID: <20210201175054.112e7f...@cvs.netbsd.org> | FreeBSD has a similar check, but they return EINVAL instead, feel | free to adjust if SUS or other standard mandates specific value Not currently (unless the C standard says something) - but EINVAL would be a better choice, fread() incorporates all errors from fgetc() (and fwrite() from fputc()) and there EOVERFLOW is used to indicate that the file offset exceeds the maximum possible file size, which is quite a different problem. EINVAL is currently unused (at least in POSIX) but makes more sense for this problem (which is am invalid arg combination). FWIW, I submitted a defect report on this issue with the Austin Group (POSIX maintainers) yesterday, though I am expecting they'll simply punt it to the C standards group. kre
Re: CVS commit: src/lib/libc/stdio
Date:Sun, 31 Jan 2021 18:34:22 +0100 From:Joerg Sonnenberger Message-ID: | That makes no sense. Just turn them into a short read, which is | something users have to deal with anyway. I'm not sure I agree with that one. If the user's size * nmemb overflows a size_t then the user is attempting something entirely insane (they cannot possibly have a buffer that big). Reading anything at all is most probably the wrong thing to do, as the args are very likely simply erroneous. EINVAL is not an unreasonable approach, I wouldn't even object to simply abort(). kre
Re: CVS commit: src/lib/libc/stdio
On 31.01.2021 18:34, Joerg Sonnenberger wrote: > On Sun, Jan 31, 2021 at 05:27:28PM +0100, Kamil Rytarowski wrote: >> On 31.01.2021 17:18, Jaromir Dolecek wrote: >>> Module Name:src >>> Committed By: jdolecek >>> Date: Sun Jan 31 16:18:22 UTC 2021 >>> >>> Modified Files: >>> src/lib/libc/stdio: fread.c >>> >>> Log Message: >>> for unbuffered I/O arrange for the destination buffer to be filled in one >>> go, instead of triggering long series of 1 byte read(2)s; this speeds up >>> fread() several order of magnitudes for this case, directly proportional >>> to the size of the supplied buffer >>> >>> change adapted from OpenBSD rev. 1.19 >>> >>> fixes PR lib/55808 by Roland Illig >>> >> >> While there it would be cool to apply FreeBSD and OpenBSD hardening for >> invalid size x nmemb, checking for overflow. I propose to return EINVAL >> in such case. > > That makes no sense. Just turn them into a short read, which is > something users have to deal with anyway. > > Joerg > The purpose of this errno is to catch insage usage of API that in normal circumstances never makes sense. With overflows (that are fine), the error can be unnoticed. Both FreeBSD and OpenBSD found this behavior as useful.
Re: CVS commit: src/lib/libc/stdio
On Sun, Jan 31, 2021 at 05:27:28PM +0100, Kamil Rytarowski wrote: > On 31.01.2021 17:18, Jaromir Dolecek wrote: > > Module Name:src > > Committed By: jdolecek > > Date: Sun Jan 31 16:18:22 UTC 2021 > > > > Modified Files: > > src/lib/libc/stdio: fread.c > > > > Log Message: > > for unbuffered I/O arrange for the destination buffer to be filled in one > > go, instead of triggering long series of 1 byte read(2)s; this speeds up > > fread() several order of magnitudes for this case, directly proportional > > to the size of the supplied buffer > > > > change adapted from OpenBSD rev. 1.19 > > > > fixes PR lib/55808 by Roland Illig > > > > While there it would be cool to apply FreeBSD and OpenBSD hardening for > invalid size x nmemb, checking for overflow. I propose to return EINVAL > in such case. That makes no sense. Just turn them into a short read, which is something users have to deal with anyway. Joerg
Re: CVS commit: src/lib/libc/stdio
On 31.01.2021 17:18, Jaromir Dolecek wrote: > Module Name: src > Committed By: jdolecek > Date: Sun Jan 31 16:18:22 UTC 2021 > > Modified Files: > src/lib/libc/stdio: fread.c > > Log Message: > for unbuffered I/O arrange for the destination buffer to be filled in one > go, instead of triggering long series of 1 byte read(2)s; this speeds up > fread() several order of magnitudes for this case, directly proportional > to the size of the supplied buffer > > change adapted from OpenBSD rev. 1.19 > > fixes PR lib/55808 by Roland Illig > While there it would be cool to apply FreeBSD and OpenBSD hardening for invalid size x nmemb, checking for overflow. I propose to return EINVAL in such case. I planned to submit it a while ago for discussion.
Re: CVS commit: src/lib/libc/string
Date:Sat, 4 Apr 2020 21:16:45 + From:David Holland Message-ID: <20200404211645.ga19...@netbsd.org> | I fail to see any scenario in which it's legitimate for an application | to scribble in internal data belonging to libc. Why should this be | permitted? You've never done something like p = getenv("PATH"); /* NULL check omitted here */ while (q = strchr(p, ':')) { *q = '\0'; /* use p to do a lookup, printf, or whatever */ *q = ':'; p = q + 1; } /* use p here too */ ?? If you have, what's the difference, genenv() is also returning data from libc, it could also be const char *, but isn't. In neither case is this what would normally be called "internal data" though, it isn't as if we're directly modifying the data structs malloc() uses (that would be internal data) - these are the results returned from libc to the application. | I don't understand. It is a bug that the library returns a writeable | pointer to data that should not be modified. But we aren't returning a pointer to data that can't be modified, libc doesn't care, one way or the other. Apps shouldn't modify it (shouldn't attempt to) because of portability concerns with other implementations.If we did as Joerg suggested and mmapped the message catalog pages for the relevant locale, and simply returned a pointer to the relevant entry, one of those could be us.But isn't. | Anyway, this is moot because strerror is defined by C; That was my point. The entry in the BUGS section of the man page was merely a rant, as this isn't (wasn't ever) something that can be fixed. | but because it *should* be const, | it should be (and is) declared with __aconst in string.h. | | Please don't change that. Not planning anything like that - this was all about the man page. | Also, perhaps we should swipe the text about not modifyingthe string | buffer from strsignal.3. I believe that what is in strerror(3) now is more or less aligned with the intent (if not the actual language) of that. Note that strsignal(3) doesn't contain a BUGS section ranting about how it should really be const char *strsignal(). kre
Re: CVS commit: src/lib/libc/string
On Thu, Mar 26, 2020 at 10:54:21AM +0700, Robert Elz wrote: > | I don't agree -- because applications shouldn't attempt to modify the > | result, it should be const. > > The only reason apps shouldn't modify the string is in case of > porting the app to an (well, some) ancient implementation. Because > of the NLS requirements, the message these days (any modern > implementation) must be read from some external file - which means > the storage for it must be writable. Nothing else (except the > calling thread) cares about the content of the returned message, so > there's no actual reason for the app to not modify it now if for > some wacky reason it wants to. I fail to see any scenario in which it's legitimate for an application to scribble in internal data belonging to libc. Why should this be permitted? > The entry that was in BUGS wasn't a bug - what would be a bug would > be if we actually made strerror() return const char * - it wasn't > even a limitation (which we traditionally include in BUGS, even > though they're generally by design, and not accident, and not > really intended to be "fixed") - what it was was a rant about the > original design spec, and what's more, one which is no longer > warranted. I don't understand. It is a bug that the library returns a writeable pointer to data that should not be modified. Anyway, this is moot because strerror is defined by C; but because it *should* be const, it should be (and is) declared with __aconst in string.h. Please don't change that. Also, perhaps we should swipe the text about not modifyingthe string buffer from strsignal.3. -- David A. Holland dholl...@netbsd.org
Re: CVS commit: src/lib/libc/string
On Thu, Mar 26, 2020 at 10:54:21AM +0700, Robert Elz wrote: > Date:Wed, 25 Mar 2020 20:51:25 + > From:David Holland > Message-ID: <20200325205125.ga11...@netbsd.org> > > | I don't agree -- because applications shouldn't attempt to modify the > | result, it should be const. > > The only reason apps shouldn't modify the string is in case of porting > the app to an (well, some) ancient implementation. Because of the NLS > requirements, the message these days (any modern implementation) must be > read from some external file - which means the storage for it must be > writable. Actually, the only reason why we really need writable space is the unknown error case. NLS could in principle be using mmaped data as well, modulo not being able to unmap it again. Joerg
Re: CVS commit: src/lib/libc/string
Date:Wed, 25 Mar 2020 20:51:25 + From:David Holland Message-ID: <20200325205125.ga11...@netbsd.org> | I don't agree -- because applications shouldn't attempt to modify the | result, it should be const. The only reason apps shouldn't modify the string is in case of porting the app to an (well, some) ancient implementation. Because of the NLS requirements, the message these days (any modern implementation) must be read from some external file - which means the storage for it must be writable. Nothing else (except the calling thread) cares about the content of the returned message, so there's no actual reason for the app to not modify it now if for some wacky reason it wants to. Note that when strerror_l() was created, the opportunity to make that return const char * existed - const existed in C by then, which it didn't when strerror() was invented - but wasn't taken, as it was clear by its very design that strerror_l had to return a pointer to writable memory, so it was made char * and not const char *. | What it points to internally doesn't matter... True, if the interface had been to return const char *. The entry that was in BUGS wasn't a bug - what would be a bug would be if we actually made strerror() return const char * - it wasn't even a limitation (which we traditionally include in BUGS, even though they're generally by design, and not accident, and not really intended to be "fixed") - what it was was a rant about the original design spec, and what's more, one which is no longer warranted. kre ps: I've never seen, and cannot really imagine, an app that would ever want to modify the result from strerror(), so none of this really matters anyway.
Re: CVS commit: src/lib/libc/string
On Wed, Mar 25, 2020 at 06:50:47PM +, Robert Elz wrote: > Modified Files: > src/lib/libc/string: strerror.3 > > Log Message: > Delete the BUGS paragraph about the "missing" const qualifier for the > result type of strerror() (and strerror_l()). While that once should > really have been present, when strerror() was invented, there was no > "const" qualifier in C to apply, and now the way the code is writtem > (really needs to be because of NLS support) the const is no longer > really appropriate. > > Applications still shouldn't attempt to modify the result however. I don't agree -- because applications shouldn't attempt to modify the result, it should be const. What it points to internally doesn't matter... -- David A. Holland dholl...@netbsd.org
Switch to compiler_rt from assembler codes? (Re: CVS commit: src/lib/libc/compiler_rt)
(added port-sun2@) On 2020/03/09 3:33, Joerg Sonnenberger wrote: On Sun, Mar 08, 2020 at 06:30:06AM +, Rin Okuyama wrote: Module Name:src Committed By: rin Date: Sun Mar 8 06:30:06 UTC 2020 Modified Files: src/lib/libc/compiler_rt: Makefile.inc Log Message: Fix broken printf(3) %d output for numbers more than two digits, e.g., printf("%d\n", 42) ---> "::" instead of "42" Our __{,u}modsi3 codes assume that __udivsi3 returns remainder to %d1 (volatile register). __udivsi3 in libgcc does not, and therefore mixing them up results in mess. This looks like the wrong fix to me. This is not about libgcc at all. The depency in the src/common code should be fixed IMO. If __{,u}modsi3 wants to make assumptions about the remainder being implicitly computed, it really should be using the divmod primitive instead. Strictly speaking, this is bug in __{,u}modsi3. However: (1) This is for optimization. There is no need for stack push and pop if they can use our special version of __udivsi3, instead of __divmodsi4. If they were ``fixed'' correctly, the resulted codes would have no big difference from the C version provided by compiler_rt. (2) Problems occur only when they are mixed with libgcc version of __udivsi3. For standalone programs, i.e., kernel and bootloader, this ``wrong'' version works well. (3) They are used only for m68000 (68010; sun2). For m68k (68020 and later; other m68k ports), GCC does not generate references to these functions at all. Therefore, I propose: (A) Continue to use these assembler codes only in standalone programs (kernel and bootloader). For userland (libc), switch to the C version provided by compiler_rt. Again, this affects only for 68010, i.e., sun2, and not for other m68k ports. Alternatively, we can: (b) Switch both standalone and userland to compiler_rt. I've checked both (A) and (B) version of live-image work fine on TME. I prefer (A) over (B), since there is no problem if they are used only in a controlled situation. Also, we are suffering from increase in kernel size on sun2. But how do you everyone think? Thanks, rin
Re: CVS commit: src/lib/libc/compiler_rt
On Sun, Mar 08, 2020 at 06:30:06AM +, Rin Okuyama wrote: > Module Name: src > Committed By: rin > Date: Sun Mar 8 06:30:06 UTC 2020 > > Modified Files: > src/lib/libc/compiler_rt: Makefile.inc > > Log Message: > Fix broken printf(3) %d output for numbers more than two digits, e.g., > > printf("%d\n", 42) ---> "::" instead of "42" > > Our __{,u}modsi3 codes assume that __udivsi3 returns remainder to > %d1 (volatile register). __udivsi3 in libgcc does not, and therefore > mixing them up results in mess. This looks like the wrong fix to me. This is not about libgcc at all. The depency in the src/common code should be fixed IMO. If __{,u}modsi3 wants to make assumptions about the remainder being implicitly computed, it really should be using the divmod primitive instead. Joerg
Re: CVS commit: src/lib/libc/compiler_rt
Oops, forgot to note that this is only for 68010, and other systems including m68k are not affected. Thanks, rin On 2020/03/08 15:30, Rin Okuyama wrote: Module Name:src Committed By: rin Date: Sun Mar 8 06:30:06 UTC 2020 Modified Files: src/lib/libc/compiler_rt: Makefile.inc Log Message: Fix broken printf(3) %d output for numbers more than two digits, e.g., printf("%d\n", 42) ---> "::" instead of "42" Our __{,u}modsi3 codes assume that __udivsi3 returns remainder to %d1 (volatile register). __udivsi3 in libgcc does not, and therefore mixing them up results in mess. To generate a diff of this commit: cvs rdiff -u -r1.36 -r1.37 src/lib/libc/compiler_rt/Makefile.inc Please note that diffs are not public domain; they are subject to the copyright notices on the relevant files. Modified files: Index: src/lib/libc/compiler_rt/Makefile.inc diff -u src/lib/libc/compiler_rt/Makefile.inc:1.36 src/lib/libc/compiler_rt/Makefile.inc:1.37 --- src/lib/libc/compiler_rt/Makefile.inc:1.36 Tue Oct 29 16:08:50 2019 +++ src/lib/libc/compiler_rt/Makefile.inc Sun Mar 8 06:30:06 2020 @@ -1,4 +1,4 @@ -# $NetBSD: Makefile.inc,v 1.36 2019/10/29 16:08:50 joerg Exp $ +# $NetBSD: Makefile.inc,v 1.37 2020/03/08 06:30:06 rin Exp $ COMPILER_RT_SRCDIR= ${NETBSDSRCDIR}/sys/external/bsd/compiler_rt/dist @@ -170,9 +170,16 @@ GENERIC_SRCS+= \ GENERIC_SRCS+= \ divmodsi4.c \ divsi3.c \ - modsi3.c \ udivmodsi4.c \ + +. if ${LIBC_MACHINE_ARCH} != "m68000" +# Our __{,u}modsi3 codes assume that __udivsi3 returns remainder to +# %d1 (volatile register). __udivsi3 in libgcc does not, and therefore +# mixing them up results in mess. +GENERIC_SRCS+= \ + modsi3.c \ umodsi3.c +. endif . if ${LIBC_MACHINE_CPU} != "sh3" # On sh3 __udivsi3 is gcc "millicode" with special calling convention
Re: CVS commit: src/lib/libc/stdlib
On Sun, Feb 23, 2020 at 10:57:28 +0100, Kamil Rytarowski wrote: > On 23.02.2020 08:46, Martin Husemann wrote: > > > Source code consistency is a very important stylistic plus, every break of > > that should be accompanied by a comment. > > Done. Thank you. -uwe
Re: CVS commit: src/lib/libc/stdlib
On 23.02.2020 08:46, Martin Husemann wrote: > On Sun, Feb 23, 2020 at 03:35:19AM +0100, Kamil Rytarowski wrote: >> Algorithm would be changed from calculating on 32bit numbers with signed >> integer overflows to an algorithm calculating on 64bit numbers. The >> __dorand48() function truncates the result to least significant 16bits >> only so it does not matter. I retained operations on 32bits avoiding >> changes of types for stylistic reasons. > > I am with uwe here - either it would not make any difference at all (on > 32bit architectures) or it would end up with the same results and would > make no performance difference (on 64 bit architectures), so going with > the consistent (unsigned long) would have been fine. > > Even better would be a cleanup to make it (uint32_t) everwhere, but of > course only after carefull examination. > > Source code consistency is a very important stylistic plus, every break of > that should be accompanied by a comment. > > Martin > Done. signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/lib/libc/stdlib
On Sun, Feb 23, 2020 at 03:35:19AM +0100, Kamil Rytarowski wrote: > Algorithm would be changed from calculating on 32bit numbers with signed > integer overflows to an algorithm calculating on 64bit numbers. The > __dorand48() function truncates the result to least significant 16bits > only so it does not matter. I retained operations on 32bits avoiding > changes of types for stylistic reasons. I am with uwe here - either it would not make any difference at all (on 32bit architectures) or it would end up with the same results and would make no performance difference (on 64 bit architectures), so going with the consistent (unsigned long) would have been fine. Even better would be a cleanup to make it (uint32_t) everwhere, but of course only after carefull examination. Source code consistency is a very important stylistic plus, every break of that should be accompanied by a comment. Martin
Re: CVS commit: src/lib/libc/stdlib
On Sun, Feb 23, 2020 at 03:35:19 +0100, Kamil Rytarowski wrote: > On 23.02.2020 03:20, Valery Ushakov wrote: > > On Sun, Feb 23, 2020 at 02:51:49 +0100, Kamil Rytarowski wrote: > > > >> On 23.02.2020 02:29, Valery Ushakov wrote: > >>> On Sat, Feb 22, 2020 at 14:07:57 +, Kamil Rytarowski wrote: > >>> > Module Name: src > Committed By:kamil > Date:Sat Feb 22 14:07:57 UTC 2020 > > Modified Files: > src/lib/libc/stdlib: _rand48.c > > Log Message: > Avoid undefined behavior in the rand48(3) implementation > > Instead of implicid promotion to signed int, > explicitly cast the arguments to unsigned int. > >>> > >>> Please, please, please, pay at least some attention to what is going > >>> on around the code you are changing. > >>> > >>> If there's already code in this function that does: > >>> > >>>accu = (unsigned long) __rand48_mult[0] * (unsigned long) xseed[0]; > >>> > >>> then keep it consistent and don't do casts to a different type > >>> > >>>accu += (unsigned int) __rand48_mult[0] * (unsigned int) xseed[2]; > >> > >> cast to unsigned long still works, but changes algorithm. My change was > >> performed deliberately. On the other hand and according to local tests > >> the end-result for unsigned long produces the same reults as cast to > >> unsigned int and unsigned char so it does not matter. > > > > I cannot make sense of your answer. Does the cast to unsigned long > > there change the algorithm or does it produce the same result? If it > > produces the same result, then it should be used to be consistent with > > the rest of the code (or the rest of the code changed to use unsigned > > int). If it does change the result, there should be a comment > > explaining it. > > Algorithm would be changed from calculating on 32bit numbers with signed > integer overflows to an algorithm calculating on 64bit numbers. The > __dorand48() function truncates the result to least significant 16bits > only so it does not matter. I retained operations on 32bits avoiding > changes of types for stylistic reasons. That still doesn't make sense to me. You took your time to figure out whats going on in this bit of code. Then you make a change that looks extremely unobvious b/c it is inconsistent with the rest of the code, and you say you did this for stylistic reasons. The next person looking at that code (in $bignum years) will have to waste their time puzzling out the reason. Why not use the knowledge you've gained of this code for good and change the code properly? The 90s unsigned long was probably meant to be 32-bit anyway (cf. X11 mess up with using long for 32 bit quantities in the protocol and then running into issues when 64-bit happened). So if doing things in 32-bit here is the right and intended thing to do, then change that unsigned long to uint32_t. If you don't want to dive that deep (which is entirely understandable), then, exactly for stylistic reasons, cast to unsigned long to be consistent with the old code - as you already established that it didn't change the result. -uwe
Re: CVS commit: src/lib/libc/stdlib
On 23.02.2020 03:20, Valery Ushakov wrote: > On Sun, Feb 23, 2020 at 02:51:49 +0100, Kamil Rytarowski wrote: > >> On 23.02.2020 02:29, Valery Ushakov wrote: >>> On Sat, Feb 22, 2020 at 14:07:57 +, Kamil Rytarowski wrote: >>> Module Name: src Committed By: kamil Date: Sat Feb 22 14:07:57 UTC 2020 Modified Files: src/lib/libc/stdlib: _rand48.c Log Message: Avoid undefined behavior in the rand48(3) implementation Instead of implicid promotion to signed int, explicitly cast the arguments to unsigned int. >>> >>> Please, please, please, pay at least some attention to what is going >>> on around the code you are changing. >>> >>> If there's already code in this function that does: >>> >>>accu = (unsigned long) __rand48_mult[0] * (unsigned long) xseed[0]; >>> >>> then keep it consistent and don't do casts to a different type >>> >>>accu += (unsigned int) __rand48_mult[0] * (unsigned int) xseed[2]; >> >> cast to unsigned long still works, but changes algorithm. My change was >> performed deliberately. On the other hand and according to local tests >> the end-result for unsigned long produces the same reults as cast to >> unsigned int and unsigned char so it does not matter. > > I cannot make sense of your answer. Does the cast to unsigned long > there change the algorithm or does it produce the same result? If it > produces the same result, then it should be used to be consistent with > the rest of the code (or the rest of the code changed to use unsigned > int). If it does change the result, there should be a comment > explaining it. > > -uwe > Algorithm would be changed from calculating on 32bit numbers with signed integer overflows to an algorithm calculating on 64bit numbers. The __dorand48() function truncates the result to least significant 16bits only so it does not matter. I retained operations on 32bits avoiding changes of types for stylistic reasons. signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/lib/libc/stdlib
On Sun, Feb 23, 2020 at 02:51:49 +0100, Kamil Rytarowski wrote: > On 23.02.2020 02:29, Valery Ushakov wrote: > > On Sat, Feb 22, 2020 at 14:07:57 +, Kamil Rytarowski wrote: > > > >> Module Name: src > >> Committed By: kamil > >> Date: Sat Feb 22 14:07:57 UTC 2020 > >> > >> Modified Files: > >>src/lib/libc/stdlib: _rand48.c > >> > >> Log Message: > >> Avoid undefined behavior in the rand48(3) implementation > >> > >> Instead of implicid promotion to signed int, > >> explicitly cast the arguments to unsigned int. > > > > Please, please, please, pay at least some attention to what is going > > on around the code you are changing. > > > > If there's already code in this function that does: > > > >accu = (unsigned long) __rand48_mult[0] * (unsigned long) xseed[0]; > > > > then keep it consistent and don't do casts to a different type > > > >accu += (unsigned int) __rand48_mult[0] * (unsigned int) xseed[2]; > > cast to unsigned long still works, but changes algorithm. My change was > performed deliberately. On the other hand and according to local tests > the end-result for unsigned long produces the same reults as cast to > unsigned int and unsigned char so it does not matter. I cannot make sense of your answer. Does the cast to unsigned long there change the algorithm or does it produce the same result? If it produces the same result, then it should be used to be consistent with the rest of the code (or the rest of the code changed to use unsigned int). If it does change the result, there should be a comment explaining it. -uwe
Re: CVS commit: src/lib/libc/stdlib
On 23.02.2020 02:29, Valery Ushakov wrote: > On Sat, Feb 22, 2020 at 14:07:57 +, Kamil Rytarowski wrote: > >> Module Name: src >> Committed By:kamil >> Date:Sat Feb 22 14:07:57 UTC 2020 >> >> Modified Files: >> src/lib/libc/stdlib: _rand48.c >> >> Log Message: >> Avoid undefined behavior in the rand48(3) implementation >> >> Instead of implicid promotion to signed int, >> explicitly cast the arguments to unsigned int. > > Please, please, please, pay at least some attention to what is going > on around the code you are changing. > > If there's already code in this function that does: > >accu = (unsigned long) __rand48_mult[0] * (unsigned long) xseed[0]; > > then keep it consistent and don't do casts to a different type > >accu += (unsigned int) __rand48_mult[0] * (unsigned int) xseed[2]; > > > -uwe > cast to unsigned long still works, but changes algorithm. My change was performed deliberately. On the other hand and according to local tests the end-result for unsigned long produces the same reults as cast to unsigned int and unsigned char so it does not matter. signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/lib/libc/stdlib
On Sat, Feb 22, 2020 at 14:07:57 +, Kamil Rytarowski wrote: > Module Name: src > Committed By: kamil > Date: Sat Feb 22 14:07:57 UTC 2020 > > Modified Files: > src/lib/libc/stdlib: _rand48.c > > Log Message: > Avoid undefined behavior in the rand48(3) implementation > > Instead of implicid promotion to signed int, > explicitly cast the arguments to unsigned int. Please, please, please, pay at least some attention to what is going on around the code you are changing. If there's already code in this function that does: accu = (unsigned long) __rand48_mult[0] * (unsigned long) xseed[0]; then keep it consistent and don't do casts to a different type accu += (unsigned int) __rand48_mult[0] * (unsigned int) xseed[2]; -uwe
Re: CVS commit: src/lib/libc/gen
On 03.02.2020 17:24, Kamil Rytarowski wrote: > On 03.02.2020 17:21, Joerg Sonnenberger wrote: >> On Sat, Feb 01, 2020 at 03:38:46PM +, Kamil Rytarowski wrote: >>> Module Name:src >>> Committed By: kamil >>> Date: Sat Feb 1 15:38:46 UTC 2020 >>> >>> Modified Files: >>> src/lib/libc/gen: pthread_atfork.c >>> >>> Log Message: >>> Switch atform allocations from malloc()+free() to mmap()+munmap() >>> >>> This avoid bootstrapping malloc too early when libc+libpthread are not >>> ready. It is called through pthread__init() -> _pthread_atfork(). >>> >>> This also helps LLVM Leak Sanitizer to pacify false positive reports. >> >> Can we please stop adding more and more hacks for the still questionable >> "new" jemalloc and sit down for a sane init model first please? I'm >> still at the point to ask to just revert to the known-to-be-working >> version and therefore would also strongly prefer this change to be >> reverted. >> >> Joerg >> > > This change was reverted as there was a fallout. Init model is sane > except for pthread_atfork()+malloc() called prematurely in pthread__init(). > OK. The pthread_atfork(3) call was introduced with the following change. Maybe we should fix Python? commit 25c9c068b18fceac33cc6fec61a1277b5dc6107c Author: joerg Date: Wed Dec 25 00:44:45 2019 + Since pthread_setspecific requires locks, ensure that they are acquired before fork and dropped in both parent and child. At least Python depends on TSD after fork, even though it is undefined behavior in POSIX. diff --git a/lib/libpthread/pthread_tsd.c b/lib/libpthread/pthread_tsd.c index 666cbab85f1b..d6e261de0bb9 100644 --- a/lib/libpthread/pthread_tsd.c +++ b/lib/libpthread/pthread_tsd.c @@ -1,4 +1,4 @@ -/* $NetBSD: pthread_tsd.c,v 1.17 2019/03/05 01:35:52 christos Exp $ */ +/* $NetBSD: pthread_tsd.c,v 1.18 2019/12/25 00:44:45 joerg Exp $ */ /*- * Copyright (c) 2001, 2007 The NetBSD Foundation, Inc. @@ -30,7 +30,7 @@ */ #include -__RCSID("$NetBSD: pthread_tsd.c,v 1.17 2019/03/05 01:35:52 christos Exp $"); +__RCSID("$NetBSD: pthread_tsd.c,v 1.18 2019/12/25 00:44:45 joerg Exp $"); /* Functions and structures dealing with thread-specific data */ #include @@ -61,6 +61,18 @@ null_destructor(void *p) #include #include +static void +pthread_tsd_prefork(void) +{ + pthread_mutex_lock(_mutex); +} + +static void +pthread_tsd_postfork(void) +{ + pthread_mutex_unlock(_mutex); +} + void * pthread_tsd_init(size_t *tlen) { @@ -68,6 +80,8 @@ pthread_tsd_init(size_t *tlen) size_t alen; char *arena; + pthread_atfork(pthread_tsd_prefork, pthread_tsd_postfork, pthread_tsd_postfork); + if ((pkm = pthread__getenv("PTHREAD_KEYS_MAX")) != NULL) { pthread_keys_max = (int)strtol(pkm, NULL, 0); if (pthread_keys_max < _POSIX_THREAD_KEYS_MAX) signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/lib/libc/gen
On 03.02.2020 17:21, Joerg Sonnenberger wrote: > On Sat, Feb 01, 2020 at 03:38:46PM +, Kamil Rytarowski wrote: >> Module Name: src >> Committed By:kamil >> Date:Sat Feb 1 15:38:46 UTC 2020 >> >> Modified Files: >> src/lib/libc/gen: pthread_atfork.c >> >> Log Message: >> Switch atform allocations from malloc()+free() to mmap()+munmap() >> >> This avoid bootstrapping malloc too early when libc+libpthread are not >> ready. It is called through pthread__init() -> _pthread_atfork(). >> >> This also helps LLVM Leak Sanitizer to pacify false positive reports. > > Can we please stop adding more and more hacks for the still questionable > "new" jemalloc and sit down for a sane init model first please? I'm > still at the point to ask to just revert to the known-to-be-working > version and therefore would also strongly prefer this change to be > reverted. > > Joerg > This change was reverted as there was a fallout. Init model is sane except for pthread_atfork()+malloc() called prematurely in pthread__init(). signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/lib/libc/gen
On Sat, Feb 01, 2020 at 03:38:46PM +, Kamil Rytarowski wrote: > Module Name: src > Committed By: kamil > Date: Sat Feb 1 15:38:46 UTC 2020 > > Modified Files: > src/lib/libc/gen: pthread_atfork.c > > Log Message: > Switch atform allocations from malloc()+free() to mmap()+munmap() > > This avoid bootstrapping malloc too early when libc+libpthread are not > ready. It is called through pthread__init() -> _pthread_atfork(). > > This also helps LLVM Leak Sanitizer to pacify false positive reports. Can we please stop adding more and more hacks for the still questionable "new" jemalloc and sit down for a sane init model first please? I'm still at the point to ask to just revert to the known-to-be-working version and therefore would also strongly prefer this change to be reverted. Joerg
Re: CVS commit: src/lib/libc/sys
In article <20200104044017.c44aef...@cvs.netbsd.org>, Kamil Rytarowski wrote: >-=-=-=-=-=- > >Module Name: src >Committed By: kamil >Date: Sat Jan 4 04:40:17 UTC 2020 > >Modified Files: > src/lib/libc/sys: ptrace.2 > >Log Message: >/tmp/cvsbigmGa You need to read the commit message in, not pass the name of the file! christos
Re: CVS commit: src/lib/libc/sys
On 04.01.2020 05:40, Kamil Rytarowski wrote: > Module Name: src > Committed By: kamil > Date: Sat Jan 4 04:40:17 UTC 2020 > > Modified Files: > src/lib/libc/sys: ptrace.2 > > Log Message: > /tmp/cvsbigmGa > Document PT_LWPSTATUS and PT_LWPNEXT in ptrace(2) Remove mentions of obsolete PT_LWPINFO. > > To generate a diff of this commit: > cvs rdiff -u -r1.82 -r1.83 src/lib/libc/sys/ptrace.2 > > Please note that diffs are not public domain; they are subject to the > copyright notices on the relevant files. > > > Modified files: > > Index: src/lib/libc/sys/ptrace.2 > diff -u src/lib/libc/sys/ptrace.2:1.82 src/lib/libc/sys/ptrace.2:1.83 > --- src/lib/libc/sys/ptrace.2:1.82Wed Oct 9 14:20:47 2019 > +++ src/lib/libc/sys/ptrace.2 Sat Jan 4 04:40:17 2020 > @@ -1,7 +1,7 @@ > -.\" $NetBSD: ptrace.2,v 1.82 2019/10/09 14:20:47 wiz Exp $ > +.\" $NetBSD: ptrace.2,v 1.83 2020/01/04 04:40:17 kamil Exp $ > .\" > .\" This file is in the public domain. > -.Dd October 9, 2019 > +.Dd January 4, 2019 > .Dt PTRACE 2 > .Os > .Sh NAME > @@ -399,7 +399,7 @@ argument should contain the name of the > and the > .Fa data > argument should contain the length of the core filename. > -.It Dv PT_LWPINFO > +.It Dv PT_LWPSTATUS > Returns information about a thread from the list of threads for the > process specified in the > .Fa pid > @@ -407,41 +407,50 @@ argument. > The > .Fa addr > argument should contain a > -.Vt struct ptrace_lwpinfo > +.Vt struct ptrace_lwpstatus > defined as: > .Bd -literal -offset indent > -struct ptrace_lwpinfo { > +struct ptrace_lwpstatus { > lwpid_t pl_lwpid; > - int pl_event; > + sigset_t pl_sigpend; > + sigset_t pl_sigmask; > + char pl_name[20]; > + void *pl_private; > }; > .Ed > .Pp > where > .Fa pl_lwpid > contains a thread LWP ID. > -Information is returned for the thread following the one with the > +Information is returned for the thread specified in > +.Fa pl_lwpid . > +.Fa pl_sigpend > +contains the signals pending on that LWP. > +.Fa pl_sigmask > +contains the signals masked on that LWP. > +.Fa pl_name > +contains printable name of the LWP. > +The string is always NUL terminated. > +.Fa pl_private > +contains the pointer to TLS base. > +.Pp > +The > +.Fa data > +argument should contain > +.Dq Li sizeof(struct ptrace_lwpinfo) . > +.It Dv PT_LWPNEXT > +Is the same as > +.Dv PT_LWPSTATUS , > +except that information is returned for the thread following the one with the > specified ID in the process thread list, or for the first thread > if > .Fa pl_lwpid > is 0. > +.Pp > Upon return > .Fa pl_lwpid > contains the LWP ID of the thread that was found, or 0 if there is > no thread after the one whose LWP ID was supplied in the call. > -.Fa pl_event > -contains the event that stopped the thread. > -Possible values are: > -.Pp > -.Bl -tag -width 30n -offset indent -compact > -.It Dv PL_EVENT_NONE > -.It Dv PL_EVENT_SIGNAL > -.It Dv PL_EVENT_SUSPENDED > -.El > -.Pp > -The > -.Fa data > -argument should contain > -.Dq Li sizeof(struct ptrace_lwpinfo) . > .It Dv PT_SYSCALL > Stops a process before and after executing each system call. > Otherwise this operation is the same as > @@ -987,10 +996,3 @@ to > .Fn ptrace > .Ec , > should be able to sidestep this. > -.Pp > -.Dv PT_SET_SIGINFO , > -.Dv PT_RESUME > -and > -.Dv PT_SUSPEND > -can change the image of process returned by > -.Dv PT_LWPINFO . > signature.asc Description: OpenPGP digital signature
Re: CVS commit: src/lib/libc/tls
>>> Joerg Sonnenberger wrote > > On Fri, Nov 22, 2019 at 08:53:25AM +0900, Takeshi Nakayama wrote: > > >>> Takeshi Nakayama wrote > > > > > >>> Joerg Sonnenberger wrote > > > > > > > On Thu, Nov 21, 2019 at 11:06:16PM +, Takeshi Nakayama wrote: > > > > > Module Name: src > > > > > Committed By: nakayama > > > > > Date: Thu Nov 21 23:06:16 UTC 2019 > > > > > > > > > > Modified Files: > > > > > src/lib/libc/tls: tls.c > > > > > > > > > > Log Message: > > > > > Fix PR/54074 and PR/54093 completely. > > > > > > > > > > More similar to the ld.elf_so logic, it is necessary to align with > > > > > p_align first. Also, invert the #ifdef condition for consistency. > > > > > > > > This commit just wastes space without reason. > > > > > > No, without this commit the TLS offset calculation is mismatch if > > > alignof(max_align_t)) != p_align. > > > > Ah, this is wrong. > > Correctly, the TLS offset calculation is mismatch with what ld(1) > > expected if p_memsz is not aligned with p_align. > > For TLS variant I, it literally just adds padding at the end of the > allocation. For TLS variant II, it is redundant, because the rounding is > already done in with max_align_t. We do not support p_align > malloc > alignment and the patch is certainly nowhere near enough to change that. > It is actively harmful in that it can make people believe that it could > ever work. I don't want to do anything about the case of p_align > malloc alignment. I just want to fix an usual 32-bit sparc statically linked binaries not working properly. The new jemalloc now uses TLS, and the 32-bit sparc statically linked binary ELF header looks like this: % readelf -l rescue/sh | egrep 'MemSiz|TLS' Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align TLS0x6dc000 0x0075c000 0x0075c000 0x00b10 0x00b14 R 0x8 The point is when MemSiz (p_memsz) is not aligned with Align (p_align) as above. In tls.c rev 1.13: tls_size = p_memsz --> 0x00b14 tls_allocation = roundup2(tls_size, alignof(max_align_t)) --> 0x00b18 tcb = calloc() + tls_allocation --> calloc() + 0x00b18 p = tcb - tls_size --> calloc() + 4 memcpy(p, tls_initaddr, tls_initsize) So, the TLS initialization image is copied to calloc() + 4. However, ld(1) assumes that the TLS area starts with: tcb - roundup2(p_memsz, p_align) = tcb - 0x00b18 --> calloc() Since this is 4 bytes different, the initial value of the thread local variable is not set correctly. To fix this problem, I added the same thing as this line in ld.elf_so: https://nxr.netbsd.org/xref/src/libexec/ld.elf_so/tls.c#236 * obj->tlssize = p_memsz, obj->tlsalign = p_align -- Takeshi Nakayama