Re: CVS commit: src/lib/libc

2024-06-08 Thread Taylor R Campbell
> Date: Sat, 8 Jun 2024 11:51:43 +0200
> From: Roland Illig 
> 
> Am 07.06.2024 um 22:50 schrieb Taylor R Campbell:
> > libc: Pacify lint on aarch64.
> >
> > +++ src/lib/libc/stdlib/Makefile.incFri Jun  7 20:50:13 2024
> > +# lint(1) spuriously complains about `*s == CHAR_MAX' even though *s
> > +# has type char.
> > +LINTFLAGS.strfmon.c += -X 230
> 
> I guess the "spuriously" here means "on platforms where 'char' is
> unsigned", not "sometimes on the same platform, unpredictably".

Correct.

> Lint does not warn if the constant is defined as a character constant,
> so '\xff' instead of 0xff would work, but I don't know what else would
> be affected if the definition of CHAR_MAX were changed.

Might work but I'm reluctant to try without extensive testing on a lot
of compilers in a lot of environments.  E.g., does that work in C89?

> Ideally, lint would not warn about this expression, but since lint only
> looks at the preprocessed translation unit, it cannot know that the 0xff
> comes from CHAR_MAX and thus is fine.  Practically, suppressing the
> warning in this particular case makes sense.
> 
> Any ideas how to resolve this situation?

No brilliant ideas, sorry, other than to teach lint to track
provenance of constants through macro expansion.  That's why I just
disabled the warning for this case.


Re: CVS commit: src/lib/libc

2024-06-08 Thread Roland Illig
Am 07.06.2024 um 22:50 schrieb Taylor R Campbell:
> libc: Pacify lint on aarch64.
>
> +++ src/lib/libc/stdlib/Makefile.inc  Fri Jun  7 20:50:13 2024
> +# lint(1) spuriously complains about `*s == CHAR_MAX' even though *s
> +# has type char.
> +LINTFLAGS.strfmon.c += -X 230

I guess the "spuriously" here means "on platforms where 'char' is
unsigned", not "sometimes on the same platform, unpredictably".

When CHAR_MAX is defined as 0xff, lint assumes that the same constant
with the same value is used on other platforms as well, thus the
warning.  This is due to lint's -p flag for portability checks.

Lint does not warn if the constant is defined as a character constant,
so '\xff' instead of 0xff would work, but I don't know what else would
be affected if the definition of CHAR_MAX were changed.

Ideally, lint would not warn about this expression, but since lint only
looks at the preprocessed translation unit, it cannot know that the 0xff
comes from CHAR_MAX and thus is fine.  Practically, suppressing the
warning in this particular case makes sense.

Any ideas how to resolve this situation?

Roland



Re: CVS commit: src/lib/libc/time

2023-12-07 Thread Steffen Nurpmeso
Valery Ushakov wrote in
 :
 |On Fri, Dec 08, 2023 at 01:32:49 +0300, Valery Ushakov wrote:
 |
 |> On Thu, Dec 07, 2023 at 20:13:37 +, Robert Elz wrote:
 |>
 |>> While here, consistemntly use minus when minus is meant, rather that
 |>> just using a hyphen.
 |>
 |> One has to be careful with this.
 |
 |And to have this on record for refernce: https://lwn.net/Articles/947941/

'Could be you like that:

  https://lists.gnu.org/archive/html/groff/2022-09/msg00048.html

(or these

  https://lists.gnu.org/archive/html/groff/2022-09/msg00053.html
  https://lists.gnu.org/archive/html/groff/2022-09/msg00057.html

heck i could vomit all thread long. :-)

 |-uwe
 --End of 

Ciao.

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
|
| Only in December: lightful Dubai COP28 Narendra Modi quote:
|  A small part of humanity has ruthlessly exploited nature.
|  But the entire humanity is bearing the cost of it,
|  especially the inhabitants of the Global South.
|  The selfishness of a few will lead the world into darkness,
|  not just for themselves but for the entire world.


Re: CVS commit: src/lib/libc/time

2023-12-07 Thread Valery Ushakov
On Fri, Dec 08, 2023 at 01:32:49 +0300, Valery Ushakov wrote:

> On Thu, Dec 07, 2023 at 20:13:37 +, Robert Elz wrote:
>
> > While here, consistemntly use minus when minus is meant, rather that
> > just using a hyphen.
>
> One has to be careful with this.

And to have this on record for refernce: https://lwn.net/Articles/947941/

-uwe


Re: CVS commit: src/lib/libc/time

2023-12-07 Thread Valery Ushakov
On Thu, Dec 07, 2023 at 20:13:37 +, Robert Elz wrote:

> While here, consistemntly use minus when minus is meant, rather that
> just using a hyphen.

One has to be careful with this.  In the literal context (Ql, Li, etc)
the ascii minus-hyphen in the input is preserved as such.  In other
contexts you will get a math minus sign in PS/PDF output, or groff
UTF-8 text output, which makes man page not copy-pastable.  And wide
(about the size of en-dash) minus sign looks rather ugly in non math
contexts like 02:00 CST (−06) in PS/PDF.

-uwe


Re: CVS commit: src/lib/libc/ssp

2023-11-14 Thread Jörg Sonnenberger
On Wednesday, November 15, 2023 4:15:28 AM CET you wrote:
> Module Name:  src
> Committed By: christos
> Date: Wed Nov 15 03:15:28 UTC 2023
> 
> Modified Files:
>   src/lib/libc/ssp: Makefile.inc
> Added Files:
>   src/lib/libc/ssp: ssp_redirect.c
> 
> Log Message:
> provide materialized functions for the ssp overriden inlines

The functions are supposed to be transparent and they used to be. Can we 
please just go back to the working state before? IMO wanting to overriding 
getcwd is absolutely no justification for this. If the prototype (and inline 
function) is visible from the header, userland should *not* be abled to 
interpose it. If it is not visible due to standard headers, there was no 
problem in first place.

Joerg




re: CVS commit: src/lib/libc/stdlib

2023-10-13 Thread matthew green
> Minor changes to jemalloc100 (the old one that only vax etc currently uses).

thanks.

i'm still using this version on a bunch of modern machines.

new jemalloc was problematic for a few things for me a
number of years ago and i keep meaning to test again, but
for now i'm still mostly using this version everwhere.

FYI.


.mrg.


Re: CVS commit: src/lib/libc/time

2022-10-26 Thread Robert Elz
Date:Wed, 26 Oct 2022 10:42:15 -0400
From:Jan Schaumann 
Message-ID:  

  | New proposal:

That looks better (it needs some minor wording changes,
there are too many indefinite articles ('a') which aren't
needed, and there midnight should be just that, no hyphen
or space -- but that's all trivia).

kre



Re: CVS commit: src/lib/libc/time

2022-10-26 Thread Jan Schaumann
Robert Elz  wrote:
> Date:Sun, 23 Oct 2022 13:53:01 -0400
> From:Jan Schaumann 
> Message-ID:  
> 
>   | Hmm, maybe something like this?
> 
> I think there is still too much there, you don't have
> to explain everything (or almost anything), but it is
> in the right direction I think.

New proposal:

---

[...] components are not restricted to their normal
ranges and will be normalized, if need be.

For example, consider a struct tm initialized with a
tm_year = 122, tm_mon = 10, tm_mday = 30, tm_hour =
22, tm_min = 57, and a tm_sec = 0.  Incrementing
tm_min by 13 and calling mktime() would lead to a
tm_hour = 23 and tm_min = 10.

This normalizing can lead to cascading changes: Again
using a struct tm initialized as in the above example
but with a tm_hour = 23, the same change would lead to
a tm_mon = 11, tm_mday = 1, tm_hour = 0, and tm_min =
10.

Negative values may also be normalized with similar
cascading effect such that e.g., a tm_hour of −1 means
1 hour before mid‐ night on the previous day and so
on.

---

?

-Jan


Re: CVS commit: src/lib/libc/time

2022-10-24 Thread Steffen Nurpmeso
Taylor R Campbell wrote in
 <20221023170035.2542f60...@jupiter.mumble.net>:
 ...
 |If you use a monotonic timer to sample the POSIX clock before and
 |after a leap second, the POSIX clock will appear to have taken twice
 |as long as it should to pass the leap second.

Just to note that the next leap second could be a negative one.

--steffen
|
|Der Kragenbaer,The moon bear,
|der holt sich munter   he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


Re: CVS commit: src/lib/libc/time

2022-10-24 Thread Robert Elz
Date:Sun, 23 Oct 2022 13:53:01 -0400
From:Jan Schaumann 
Message-ID:  

  | Hmm, maybe something like this?

I think there is still too much there, you don't have
to explain everything (or almost anything), but it is
in the right direction I think.

  | For example, consider a struct tm initialized with a
  | tm_year = 122, tm_mon = 11, tm_mday = 30, tm_hour =
  | 22, tm_min = 57, and a tm_sec = 0, using UTC,
  | representing 2022‐12‐31T22:57:00Z.

That last bit (the ISO format spec of the time) isn't
really needed, but doesn't hurt either - but I would avoid
an example that touches anywhere near midnight Dec 31, UTC,
(which this one does later, if not here) as that's exactly
where leap seconds start to appear, and (as illustrated in
the side-discussion in this thread) that's where things get messy.
Best to just avoid that  (Avoid June 30 for the same reason.)

  | Incrementing
  | tm_min by 13 and calling mktime() would yield a time_t
  | with value 1672528200,

That's irrelevant, the time_t returned isn't what's really
interesting here, and its binary value is way too much (useless)
information, it is the modification made to the tm that matters.
It is already clear enough in the doc (I think) that the result
from mktime() is the time_t for the normalised tm.

  | representing 2022‐12‐31T23:10:00Z,

this, or just give the adjusted tm_min and tm_hour
values, there's no need for both, and

  | as the tm_min = 70 was
  | normalized to an increment of tm_hour by one and
  | tm_min = 10.

no need for the explanation of how it was done.

  | This normalizing can lead to cascading changes: Again
  | using a struct tm initialized as in the above example
  | but with a tm_hour = 23, the same change would yield a
  | time_t with value 1672531800, representing
  | 2023‐01‐01T00:10:00Z

That's the adjustment we want to avoid, as it gets right into
what happens if we're observing leap seconds, and one was to
happen in that period.There's no need to show the year being
incremented, showing the month going up would be enough, readers
ought to be able to deduce that if the month changes from Dec to
Jan then the year would be incremented by one.

  | the normalization of tm_min incremented tm_hour,

This explanation is not needed, but if it were, that
would be correct, but

  | which lead to a normalization of tm_mday, which in
  | turn caused tm_mon to be normalized,

but not those.   Those fields (in this example) were
already within the appropriate range, they don't need
to be normalised, they're simply adjusted, or as in
this (or tm_hour above):

  | which in turn lead to the increment of tm_year.

  | In addition,

That's perhaps poor wording here, "addition" followed
immediately by "negative", it could be "Also" which
avoids this, but this lead in clause is not really
needed at all, just begin like:

  | negative values may also be normalized,
  | such that e.g., a tm_hour of −1 means 1 hour before
  | midnight, tm_mday of 0 means the day preceding the
  | current month, and tm_mon of −2 means 2 months before
  | January of tm_year.

Again, too much there, we don't need examples of everything.

I still feel though that an example with more than one adjustment
to a tm returned by localtime (though how it was originally
created is irrelevant) and which would affect Feb 29 were
it a leap year, would be worth giving (in both the leap
year and non leap year cases) so we get Feb 29 in one
case and Mar 1 in the other, from an adjustment that
affects months (and days, either directly, or as a flow on
from hours, mins, or secs) - like going back a month from
Mar 28, then forward to the next day, and sometimes still
being in Mar.

  | The fact that mktime(2) returns a time_t makes the
  | phrasing even more awkward,

Yes, as above.

  | and I suppose we could
  | leave out the actual value and say "would yield a
  | time_t representing..."?

No, nothing about the result returned at all, this should
all just be about what normalising the tm causes to happen.
Those values are not just internal, the tm passed (via ref)
is modified if needed before mktime returns.

kre



Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Warner Losh
On Sun, Oct 23, 2022 at 11:00 AM Taylor R Campbell <
campbell+netbsd-source-change...@mumble.net> wrote:

> > Date: Sun, 23 Oct 2022 07:39:25 -0600
> > From: Warner Losh 
> >
> > I guess a more accurate way of saying this is that leap seconds simply
> > aren't reliable, cannot be made reliable, and this affects normalization
> > in ways too numerous to mention due to the details of the tz files, bugs
> > in the system, and lack of others to implement them correctly.
>
> I think you mean `POSIX clocks simply aren't reliable'.  They _could_
> be made reliable, though, by fixing the the calendar arithmetic
> formula in POSIX for mapping time_t to and from UTC -- just like POSIX
> already fixed the bug in its Gregorian leap year formula.
>

Except they can't, at least not practically enough to be a standard. The
Gregorian Leap Year formula is a mathematical formula that needs no
further data other than the broken down time to compute. It's not an
observational calendar, but a computational or arithmetic one. UTC is an
observational calendar. We barely know if there's going to be a leap
second in the coming months, and have nothing more than a vague notion
of when the one after that might be. You must have a table of all past
leap seconds to do any kind of sensible mapping. And you also must
have some way to keep that up to date, even when machines are
powered off, or installed from not really that old media (anything older
than 6 months can't possibly have the right leap table, except by
chance). And then the question becomes how do you get it, do you
assume connectivity, some standard media format, some standard file
format, etc. All of these details means POSIX can't really fix this. And
even if they do, the current formula has been around so long there's a
lot of dusty decks of code that will likely silently break. You can
ameliorate
that somewhat by inventing new interfaces, but issues like the one you go
into below will still persist.


> > > The code works with either set of tzdata files, POSIX stretchy secs,
> > > or UTC with leap secs - claiming that one doesn't happen, or cannot
> > > happen, isn't really correct.
> >
> > Yea, and even 'posix stretchy sec' is really a misnomer. POSIX simply
> > counts time without leap seconds. Each second is the same length,
>
> Not at all.
>
> If you use a monotonic timer to sample the POSIX clock before and
> after a leap second, the POSIX clock will appear to have taken twice
> as long as it should to pass the leap second.
>
> Of course, it's worse.  If sampled at _subsecond_ intervals, a POSIX
> clock behaves _erratically_: it spontaneously rewinds itself!
>
> Suppose we have a machine with a monotonic clock that counts SI
> seconds as well as a POSIX clock:
>
> SI monotonicPOSIX
> 123.25  1483228799.00
> 123.50  1483228799.25
> 123.75  1483228799.50  # t0 = boottime + 123.75
> 124.00  1483228799.75
> 124.25  1483228800.00  # leap second begins at 2016-12-31T23:59:60Z
> 124.50  1483228800.25
> 124.75  1483228800.50
> 125.00  1483228800.75  # t1 = boottime + 125.00
> 125.25  1483228800.00  # POSIX clock rewinds at
> 2017-01-01T00:00:00Z!
> 125.50  1483228800.25
> 125.75  1483228800.50  # t2 = boottime + 125.75
> 126.00  1483228800.75
>
> At supersecond resolution, t2 - t0 is a duration of 2 SI seconds, but
> a POSIX clock reports a time difference POSIX(t2) - POSIX(t0) of 1, so
> `POSIX seconds' are not always SI seconds -- it is not the case that
> `each [POSIX] second is the same length', even ignoring physical clock
> sampling error.
>

Right. Except during that brief interval around a leap second, all the
seconds
are the same size. They aren't expanded or contracted by running the clock
at a different frequency that was done between approx 1960-1972 which is
often referred to as the rubber leap second era. It was more on that basis
that
I was objecting to the turn of phrase because that sort of thing isn't
happening.


> And, of course, at subsecond resolution, the POSIX clock rewinds.  The
> monotonic clock correctly has t1 < t2, but POSIX(t1) > POSIX(t2).  And
> this erratic behaviour is much worse than a typical NTP-driven clock
> adjustment at random times, because by design this erratic behaviour
> happens on ~every computer on the planet simultaneously!
>

Yea, if NTP knows about the leap, it can deal with it. The problem as you
say comes in when the stratum 1 servers don't announce the leap second
far enough in advance for the implementations to cope. It then devolves to
the 'when, exactly, do you step the time back' problem since there's a
couple
of choices, unless you have the 'leap smear' ntp servers which do it over
a few hours.


> There's no need for this nonsense except insistence on the formula
> that says every UTC day is counted by 86400 `POSIX seconds'.  POSIX
> could be revised to fix this bug in the clock by just not doing civil
> calendar 

Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Taylor R Campbell
> Date: Sun, 23 Oct 2022 12:19:37 -0600
> From: Warner Losh 
> 
>  You must have a table of all past
> leap seconds to do any kind of sensible mapping. And you also must
> have some way to keep that up to date, [...]

It's the same for all civil calendar matters with political changes
around summer time, but at a smaller scale: with an outdated tzdb,
where localtime might be off for an hour, mktime might be off by a few
seconds.

> > And, of course, at subsecond resolution, the POSIX clock rewinds.  The
> > monotonic clock correctly has t1 < t2, but POSIX(t1) > POSIX(t2).  And
> > this erratic behaviour is much worse than a typical NTP-driven clock
> > adjustment at random times, because by design this erratic behaviour
> > happens on ~every computer on the planet simultaneously!
> 
> Yea, if NTP knows about the leap, it can deal with it. The problem as you
> say comes in when the stratum 1 servers don't announce the leap second
> far enough in advance for the implementations to cope. It then devolves to
> the 'when, exactly, do you step the time back' problem since there's a
> couple
> of choices, unless you have the 'leap smear' ntp servers which do it over
> a few hours.

I mentioned NTP-driven clock adjustments as an example of _other_
times that a POSIX clock might rewind.  Such adjustments arise from
local clock drift and so occur randomly, in contrast to the rewinding
on leap seconds that is globally coordinated as a deliberate design
decision of POSIX leading to simultaneous worldwide computer system
failures.

NTP doesn't have to know about leap seconds at all, nor does POSIX.
They could both just report the number of SI seconds (averaged in the
usual TAI way) since an appropriate epoch, and all the leap second
adjustment logic could be ripped out and moved to tzdb and mktime just
like time zones and summer time.

The fundamental problem with a POSIX clock -- and to a lesser extent
NTP, although NTP at least transmits additional information to work
around it during a leap second transition unlike POSIX programs -- is
that it goes out of its way to step back the _internal concept of
timecounting_ (tick-tick-tick, counting what everyone expects to be SI
seconds) just to deal with an _external civil calendar presentation_,
like the summer time change.  And it makes sure to do so
simultaneously on all computers worldwide.  That's a spectacularly bad
design leading to tremendously costly mistakes.  Leap seconds aren't a
big deal; the erratic design of POSIX clocks is.


Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Jan Schaumann
Robert Elz  wrote:
> Date:Sat, 22 Oct 2022 13:42:54 -0400
> From:Jan Schaumann 
> Message-ID:  
> 
>   | A full set of examples strikes me as too much here
> 
> A full set would be yes, but I think we could handle 3.

Hmm, maybe something like this?


...and will be normalized, if need be.

For example, consider a struct tm initialized with a
tm_year = 122, tm_mon = 11, tm_mday = 30, tm_hour =
22, tm_min = 57, and a tm_sec = 0, using UTC,
representing 2022‐12‐31T22:57:00Z.  Incrementing
tm_min by 13 and calling mktime() would yield a time_t
with value 1672528200, representing
2022‐12‐31T23:10:00Z, as the tm_min = 70 was
normalized to an increment of tm_hour by one and
tm_min = 10.

This normalizing can lead to cascading changes: Again
using a struct tm initialized as in the above example
but with a tm_hour = 23, the same change would yield a
time_t with value 1672531800, representing
2023‐01‐01T00:10:00Z: the normalization of tm_min
incremented tm_hour, which lead to a normalization
of tm_mday, which in turn caused tm_mon to be
normalized, which in turn lead to the increment of
tm_year.

In addition, negative values may also be normalized,
such that e.g., a tm_hour of −1 means 1 hour before
midnight, tm_mday of 0 means the day preceding the
current month, and tm_mon of −2 means 2 months before
January of tm_year.
---

The fact that mktime(2) returns a time_t makes the
phrasing even more awkward, and I suppose we could
leave out the actual value and say "would yield a
time_t representing..."?

Other ideas to improve this?

-Jan


Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Taylor R Campbell
> Date: Sun, 23 Oct 2022 07:39:25 -0600
> From: Warner Losh 
> 
> I guess a more accurate way of saying this is that leap seconds simply
> aren't reliable, cannot be made reliable, and this affects normalization
> in ways too numerous to mention due to the details of the tz files, bugs
> in the system, and lack of others to implement them correctly.

I think you mean `POSIX clocks simply aren't reliable'.  They _could_
be made reliable, though, by fixing the the calendar arithmetic
formula in POSIX for mapping time_t to and from UTC -- just like POSIX
already fixed the bug in its Gregorian leap year formula.

> > The code works with either set of tzdata files, POSIX stretchy secs,
> > or UTC with leap secs - claiming that one doesn't happen, or cannot
> > happen, isn't really correct.
> 
> Yea, and even 'posix stretchy sec' is really a misnomer. POSIX simply
> counts time without leap seconds. Each second is the same length,

Not at all.

If you use a monotonic timer to sample the POSIX clock before and
after a leap second, the POSIX clock will appear to have taken twice
as long as it should to pass the leap second.

Of course, it's worse.  If sampled at _subsecond_ intervals, a POSIX
clock behaves _erratically_: it spontaneously rewinds itself!

Suppose we have a machine with a monotonic clock that counts SI
seconds as well as a POSIX clock:

SI monotonicPOSIX
123.25  1483228799.00
123.50  1483228799.25
123.75  1483228799.50  # t0 = boottime + 123.75
124.00  1483228799.75
124.25  1483228800.00  # leap second begins at 2016-12-31T23:59:60Z
124.50  1483228800.25
124.75  1483228800.50
125.00  1483228800.75  # t1 = boottime + 125.00
125.25  1483228800.00  # POSIX clock rewinds at 2017-01-01T00:00:00Z!
125.50  1483228800.25
125.75  1483228800.50  # t2 = boottime + 125.75
126.00  1483228800.75

At supersecond resolution, t2 - t0 is a duration of 2 SI seconds, but
a POSIX clock reports a time difference POSIX(t2) - POSIX(t0) of 1, so
`POSIX seconds' are not always SI seconds -- it is not the case that
`each [POSIX] second is the same length', even ignoring physical clock
sampling error.

And, of course, at subsecond resolution, the POSIX clock rewinds.  The
monotonic clock correctly has t1 < t2, but POSIX(t1) > POSIX(t2).  And
this erratic behaviour is much worse than a typical NTP-driven clock
adjustment at random times, because by design this erratic behaviour
happens on ~every computer on the planet simultaneously!

There's no need for this nonsense except insistence on the formula
that says every UTC day is counted by 86400 `POSIX seconds'.  POSIX
could be revised to fix this bug in the clock by just not doing civil
calendar adjustments in the basic clock that goes tick-tick-tick for
counting what most people think are going to be SI seconds.  For the
time_t<->UTC conversion in libc, machines with out-of-date tzdb would
just be off by a few seconds sometimes, no worse than being off by an
hour in the time_t<->localtime conversion with an out-of-date tzdb
across an updated summer time change.


Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Warner Losh
On Sat, Oct 22, 2022 at 10:55 PM Robert Elz  wrote:

> Date:Sat, 22 Oct 2022 20:17:57 -0600
> From:Warner Losh 
> Message-ID:  <
> canczdfqhkz0xs7lf6lmjveontc5dgsonons_ug6fzsf30og...@mail.gmail.com>
>
>
>   | On the other hand, adding a caveat that leap seconds don't exist in a
> POSIX
>   | time_t and that's necessarily reflected in the normalization wouldn't
> be
>   | bad.
>
> The problem with doing that is that while we don't install it this
> way, it is possible to install tzdata such that leap seconds do occur
> (despite that not being posix) either unconditionally, or installing
> both and leaving it up to the user by their choice of timezone to use
> (which makes rather a mess).
>

That's true. In the default "UTC system time" clock, leap seconds simply
do no work at all. With the tzdata with leap seconds, they work for a
limited
subset of things, but not quite everything because you can't completely
close the gap due to time_t not having an encoding for them. Having tried
to deploy systems that needed to present a pedantically-correct UTC
time to the outside world shows many hole (most of which self correct
and are mostly right not near leap seconds).

I guess a more accurate way of saying this is that leap seconds simply
aren't reliable, cannot be made reliable, and this affects normalization
in ways too numerous to mention due to the details of the tz files, bugs
in the system, and lack of others to implement them correctly.


> The code works with either set of tzdata files, POSIX stretchy secs,
> or UTC with leap secs - claiming that one doesn't happen, or cannot
> happen, isn't really correct.
>

Yea, and even 'posix stretchy sec' is really a misnomer. POSIX simply
counts time without leap seconds. Each second is the same length,
and it doesn't make them longer or shorter. It just assumes some
ad-hoc, beyond the standard method for synchronization. The quality
of these ad-hoc methods ranges from terrible to halfway-decent, but
none are perfect.

OK. I've got my annual leap seconds rant out of the way. Even 15 years
after shipping these systems, the almost (but not completely) right nature
of how computers implement them still has my knee jerking just a little
too much...

Warner


Re: CVS commit: src/lib/libc/time

2022-10-23 Thread Warner Losh
On Sat, Oct 22, 2022, 8:05 PM Robert Elz  wrote:

> Date:Sat, 22 Oct 2022 13:42:54 -0400
> From:Jan Schaumann 
> Message-ID:  
>
>   | A full set of examples strikes me as too much here
>
> A full set would be yes, but I think we could handle 3.
> They don't need to be C code examples, just something
> like
> if tm_year==2022 tm_mon==10 tm_mday==19
>tm_hour==23 tm_min==58 tm_sec==17 and tm_dst==-1
>
> and tm_min was incremented as tm_min+=180, and then
> mktime() called upon the result, the struct tm would
> now have tm_min==1 tm_hour==24 and tm_mday==20, with
> tm_isdst 0 or 1 depending upon the local time zone.
> The other fields shown would not be altered in this
> example, the fields of struct tm not show here would
> (tm_wday, ...) would be filled in with appropriate values.
>

That sounds like a good idea...

[Using tm_min rather than tm_sec to avoid needing to include
> caveats about possible leap seconds, which, at least currently,
> could never happen in November, but all of that is too much
> to bother with.]
>

On the other hand, adding a caveat that leap seconds don't exist in a POSIX
time_t and that's necessarily reflected in the normalization wouldn't be
bad. It's a quite common misconception that somehow they do because they
exist in UTC... while you can encode any valid UTC time, they are always
normalized away.

The other examples could be even shorter, no need to repeat
> the final sentence about the other field changes or otherwise.
>
>   | Perhaps:
>   |
>   | This normalization is done in order from tm_sec up to
>   | tm_year,
>
> No, that is what you cannot say, because it is wrong.
>
> We do not always start from tm_sec, and tm_mon is *always*
> normalised (with its possible flow on to tm_year) before
> tm_mday - nothing else makes sense.   tm_year isn't
> "normalized" at all, though it might be adjusted.
>
> But all of this is what we already agreed is too much
> to explain.   And no matter how simple it makes things,
> the man pages cannot (deliberately) lie.
>

Agreed. These nuances are important.

  | possibly leading to cascading values.  That
>   | is, a tm_sec value of e.g., 185 with a tm_min value of
>   | 58 could lead to an increment of tm_min by three, and
>   | thus further incrementing tm_hour by one, and so on.
>
> This is more explaining how it works, which if we do at all,
> we need to do correctly.   That's where a few examples can
> be better, they allow the reader to deduce what probably
> happens, the weird one is there to show that the results
> will not always be what is expected at first glance, but
> that they always are correct.
>
>   | As with most things relating to
>   | time, dates, and calendars, the full details of these
>   | side effects are often non‐obvious, and it may be best
>   | to avoid such scenarios.
>
> And that part is just useless, unless you mean that non-normalised
> values should never be used, perhaps just in some fields (and it
> isn't that anyway, a big enough increment of tm_sec is the same
> as an increment of tm_mday or tm_mon) in which case we should just
> say that.   If not that, then hinting that it is sometimes odd,
> and those cases, which ones we refuse to tell you, should be
> avoided is a complete waste of time.   Which scenarios should be
> avoided?   The answer really is none of them, the answer is
> always correct, it just might not be what the user expected.
>


All good points.

Warner

> kre
>
>


Re: CVS commit: src/lib/libc/time

2022-10-22 Thread Robert Elz
Date:Sat, 22 Oct 2022 20:17:57 -0600
From:Warner Losh 
Message-ID:  



  | On the other hand, adding a caveat that leap seconds don't exist in a POSIX
  | time_t and that's necessarily reflected in the normalization wouldn't be
  | bad.

The problem with doing that is that while we don't install it this
way, it is possible to install tzdata such that leap seconds do occur
(despite that not being posix) either unconditionally, or installing
both and leaving it up to the user by their choice of timezone to use
(which makes rather a mess).

The code works with either set of tzdata files, POSIX stretchy secs,
or UTC with leap secs - claiming that one doesn't happen, or cannot
happen, isn't really correct.

kre




Re: CVS commit: src/lib/libc/time

2022-10-22 Thread Robert Elz
Date:Sun, 23 Oct 2022 08:33:18 +0700
From:Robert Elz 
Message-ID:  <7638.1666488...@jacaranda.noi.kre.to>

  | and tm_min was incremented as tm_min+=180, and then
  | mktime() called upon the result, the struct tm would
  | now have tm_min==1 tm_hour==24 and tm_mday==20, with

That is nonsense of course, but I think you get the idea
(and being nonsense shows why the examples should be
tested )

kre


Re: CVS commit: src/lib/libc/time

2022-10-22 Thread Robert Elz
Date:Sat, 22 Oct 2022 13:42:54 -0400
From:Jan Schaumann 
Message-ID:  

  | A full set of examples strikes me as too much here

A full set would be yes, but I think we could handle 3.
They don't need to be C code examples, just something
like
if tm_year==2022 tm_mon==10 tm_mday==19
   tm_hour==23 tm_min==58 tm_sec==17 and tm_dst==-1

and tm_min was incremented as tm_min+=180, and then
mktime() called upon the result, the struct tm would
now have tm_min==1 tm_hour==24 and tm_mday==20, with
tm_isdst 0 or 1 depending upon the local time zone.
The other fields shown would not be altered in this
example, the fields of struct tm not show here would
(tm_wday, ...) would be filled in with appropriate values.

[Using tm_min rather than tm_sec to avoid needing to include
caveats about possible leap seconds, which, at least currently,
could never happen in November, but all of that is too much
to bother with.]

The other examples could be even shorter, no need to repeat
the final sentence about the other field changes or otherwise.

  | Perhaps:
  |
  | This normalization is done in order from tm_sec up to
  | tm_year,

No, that is what you cannot say, because it is wrong.

We do not always start from tm_sec, and tm_mon is *always*
normalised (with its possible flow on to tm_year) before
tm_mday - nothing else makes sense.   tm_year isn't
"normalized" at all, though it might be adjusted.

But all of this is what we already agreed is too much
to explain.   And no matter how simple it makes things,
the man pages cannot (deliberately) lie.

  | possibly leading to cascading values.  That
  | is, a tm_sec value of e.g., 185 with a tm_min value of
  | 58 could lead to an increment of tm_min by three, and
  | thus further incrementing tm_hour by one, and so on.

This is more explaining how it works, which if we do at all,
we need to do correctly.   That's where a few examples can
be better, they allow the reader to deduce what probably
happens, the weird one is there to show that the results
will not always be what is expected at first glance, but
that they always are correct.

  | As with most things relating to
  | time, dates, and calendars, the full details of these
  | side effects are often non‐obvious, and it may be best
  | to avoid such scenarios.

And that part is just useless, unless you mean that non-normalised
values should never be used, perhaps just in some fields (and it
isn't that anyway, a big enough increment of tm_sec is the same
as an increment of tm_mday or tm_mon) in which case we should just
say that.   If not that, then hinting that it is sometimes odd,
and those cases, which ones we refuse to tell you, should be
avoided is a complete waste of time.   Which scenarios should be
avoided?   The answer really is none of them, the answer is
always correct, it just might not be what the user expected.

kre



Re: CVS commit: src/lib/libc/time

2022-10-22 Thread Jan Schaumann
Robert Elz  wrote:
> jo...@bec.de said:
>   | I'd actually weasle out a bit
> 
> Yes, I would too, but

A full set of examples strikes me as too much here
still, and I do like the idea of weaseling out.

Perhaps:

This normalization is done in order from tm_sec up to
tm_year, possibly leading to cascading values.  That
is, a tm_sec value of e.g., 185 with a tm_min value of
58 could lead to an increment of tm_min by three, and
thus further incrementing tm_hour by one, and so on.
Likewise, negative values can lead to decrementing
other tm(3) fields.  As with most things relating to
time, dates, and calendars, the full details of these
side effects are often non‐obvious, and it may be best
to avoid such scenarios.

?

-Jan


Re: CVS commit: src/lib/libc/time

2022-10-22 Thread Robert Elz
Date:Fri, 21 Oct 2022 15:00:41 -0400
From:Jan Schaumann 
Message-ID:  

  | I believe that it's useful for people to know that
  | values outside the normal ranges are normalized,

Agreed.

  | as well as what that might look like.

The problem with trying to specify that, as we have
discovered, is that it is not easy.   The problem with
only doing half of the job here, is that sometime later
someone will read it, see what it says happens, and
what it doesn't say, and then expect the code to do
that - when it probably doesn't.

I'll come back to the rest of your message a bit lower,
but first a brief interlude ...

jo...@bec.de said:
  | I'd actually weasle out a bit

Yes, I would too, but

jo...@bec.de continued:
  | and just declare that normalisation is best effort and we
  | don't guarantee behavior for values that are very much outside
  | the range of the corresponding field. E.g. anything where
  | numerical overflow can happen in a component.

that isn't really the problem.  tzcode is very careful with arithmetic
to avoid those kinds of problems - if there's anywhere that it isn't
(which is still possible) that would be fixed if pointed out.

The problem is just the complexity of specifying what does happen,
in a way that fits the manual page in a way that is easier to
follow that simply reading the source (which isn't all that easy).

Now we return to the originally scheduelled e-mail:

jscha...@netmeister.org said:
  | That helps a lot when trying to determine _why_ the
  | output of mktime(3) [...]

  | Noting that values may cascade is useful [...]

I wonder if rather than attempting to specify what happens,
it might be better in this case to give examples of using
struct tm alterations, followed by mktime(), to alter the time,
and what the results are.

I'm not generally much in favour of examples in man pages,
I prefer specification, and leave it up to the reader to
determine how to use things to achieve the desired result,
as while giving an example or two (or three) of what can be
done is great for people who want to do one of those exact
things (and so can just copy the example) but is useless for
someone whose needs are different than any of the examples,
and is left trying to guess what will happen.

But here a two or three good examples of what happens might
just be a sane solution, perhaps two fairly simple ones, but
which demonstrate what happens with addition, and subtraction
- using just one field modification, but having that result in
a cascade through a few of the other fields - perhaps the 180
seconds forward, starting at 23:58 on the last day of some
month, and 32 hours backwards starting at 01:15 on the second
of a month.Those should both produce entirely reasonable
and understandable results.

The third should show multiple modifications, in a way that produces,
or can do, a surprising result, perhaps 14 days forward, also
2 months forward, and some irrelevant small number of minutes backward
(that just to show that multiple fields can be altered, and they don't
need to all go in the same direction) starting at 13:27 on Dec 15 - where
the result can be Feb 29, or Mar 1 (at something a bit earlier than 13:27,
perhaps even make the number of minutes subtracted > 27 in that example)
depending upon whether the year is a leap year or not.   (Or make it
6 months forward (plus...) starting from August).  This one should be
one full example, with something like "if the starting year had been
 then the result would have been" added to show the difference
between leap years and others.

If this is done, the examples should all be tested to verify the
claimed results, but the man page should not need a lot of explanation
of why the result shown is achieved, just the input, the modification
made, and the result.

kre



Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Joerg Sonnenberger
Am Fri, Oct 21, 2022 at 02:06:32PM +0700 schrieb Robert Elz:
> Date:Fri, 21 Oct 2022 03:05:15 +
> From:"Jan Schaumann" 
> Message-ID:  <20221021030515.cdc9df...@cvs.netbsd.org>
> 
>   | Note normalizing behavior of mktime(3) using language from FreeBSD.
> 
> If we are going to start specifying what happens (how the struct tm
> is normalised) then we really also need to specify in which order
> the fields are corrected.   It makes a difference.

I'd actually weasle out a bit and just declare that normalisation is
best effort and we don't guarantee behavior for values that are very
much outside the range of the corresponding field. E.g. anything where
numerical overflow can happen in a component.

Joerg


Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Jan Schaumann
Robert Elz  wrote:
> Date:Fri, 21 Oct 2022 11:54:08 -0400
> From:Jan Schaumann 
> Message-ID:  
> 
>   | Right, but all that strikes me as too much to put into
>   | the manual page here.
> 
> I agree, which is why I wondered if we should be giving any
> details about how it is done at all.

I believe that it's useful for people to know that
values outside the normal ranges are normalized, as
well as what that might look like.

That helps a lot when trying to determine _why_ the
output of mktime(3) with a tm_sec = 180 yields no
error and a value of tm_min += 3.  I don't think many
people would want to go and dig up the sources to make
sense of this.

Noting that values may cascade is useful (because
that, too, is not obvious), but seems to me the limit
of detail desired here, while still being useful.

-Jan


Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Robert Elz
Date:Fri, 21 Oct 2022 11:54:08 -0400
From:Jan Schaumann 
Message-ID:  

  | Right, but all that strikes me as too much to put into
  | the manual page here.

I agree, which is why I wondered if we should be giving any
details about how it is done at all.

  | I think what matters in this context is that there is
  | an order and that there is a cascading effect.

Perhaps, but

  | "This normalization is done in order from tm_sec to
  | tm_mon (inclusive), possibly leading to cascading
  | values."

As a generic order, that's backwards.   Really we need to
go from year (which is always OK) to month (which can always
be corrected simply, assuming the Gregorian calendar is in
effect, which we assume it is, even for years before that
calendar was created).   Once we know the year and month,
but not before, the day of month can be corrected (which
might then affect the month and year, but only once).

The code doesn't do it in quite that order, it does the
secs mins hours first, but doing it that way ensures that
the hours correction cannot affect the day, after it has
been corrected, the day(of month) always gets set last
(leap seconds excepted).

It really is moderately complicated - trying to explain it
at all risks being simply wrong, or so vague as to be
completely useless.I'm not sure that what was there
before (no details at all, just (in other words) "it gets
fixed") wasn't as good as is reasonable to expect.

kre



Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Jan Schaumann
Robert Elz  wrote:
> Date:Fri, 21 Oct 2022 10:36:23 -0400
> From:Jan Schaumann 
> Message-ID:  
> 
> 
>   | Perhaps like this?
> 
> Like that, yes, but as you wrote it isn't how it is actually
> done I believe.
> 
> The order looks to be secs, mins, hours, month, day(of month).
> 
> There's no normalisation of the year (no invalid values to correct)
> though the year can be adjusted when the month is adjusted.
> 
> That order is assuming that the system is using posix stretchy seconds,
> rather than UTC (with leap seconds), but almost all NetBSD systems
> work that way - if UTC is being used, then secs cannot be done first,
> as until the rest of the data/time is known, it is unknown whether
> the number of seconds permitted is 59 60 or 61.   With POSIX seconds
> the secs/mins/hours correction order doesn't really matter, as there
> are always 0..59 secs, 0..59 mins, and 0..23 hours, the rest of the
> fields have no impact at all.

Right, but all that strikes me as too much to put into
the manual page here.  I think what matters in this
context is that there is an order and that there is a
cascading effect.

"This normalization is done in order from tm_sec to
tm_mon (inclusive), possibly leading to cascading
values."

or

"This normalization is done in order from tm_sec up to
tm_year, possibly leading to cascading values."

provides sufficient information, I think?

-Jan


Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Robert Elz
Date:Fri, 21 Oct 2022 10:36:23 -0400
From:Jan Schaumann 
Message-ID:  


  | Perhaps like this?

Like that, yes, but as you wrote it isn't how it is actually
done I believe.

The order looks to be secs, mins, hours, month, day(of month).

There's no normalisation of the year (no invalid values to correct)
though the year can be adjusted when the month is adjusted.

That order is assuming that the system is using posix stretchy seconds,
rather than UTC (with leap seconds), but almost all NetBSD systems
work that way - if UTC is being used, then secs cannot be done first,
as until the rest of the data/time is known, it is unknown whether
the number of seconds permitted is 59 60 or 61.   With POSIX seconds
the secs/mins/hours correction order doesn't really matter, as there
are always 0..59 secs, 0..59 mins, and 0..23 hours, the rest of the
fields have no impact at all.

kre



Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Jan Schaumann
Robert Elz  wrote:
> Date:Fri, 21 Oct 2022 03:05:15 +
> From:"Jan Schaumann" 
> Message-ID:  <20221021030515.cdc9df...@cvs.netbsd.org>
> 
>   | Note normalizing behavior of mktime(3) using language from FreeBSD.
> 
> If we are going to start specifying what happens (how the struct tm
> is normalised) then we really also need to specify in which order
> the fields are corrected.   It makes a difference.

Perhaps like this?


Index: ctime.3
===
RCS file: /cvsroot/src/lib/libc/time/ctime.3,v
retrieving revision 1.65
diff -u -p -r1.65 ctime.3
--- ctime.3 21 Oct 2022 03:08:29 -  1.65
+++ ctime.3 21 Oct 2022 14:34:39 -
@@ -264,6 +264,11 @@ of 0 means the day preceding the current
 .Fa tm_mon
 of \-2 means 2 months before January of
 .Fa tm_year .
+This normalization is done in order from
+.Fa tm_sec
+to
+.Fa tm_year ,
+possibly leading to cascading values.
 (A positive or zero value for
 .Fa tm_isdst
 causes


Re: CVS commit: src/lib/libc/time

2022-10-21 Thread Robert Elz
Date:Fri, 21 Oct 2022 03:05:15 +
From:"Jan Schaumann" 
Message-ID:  <20221021030515.cdc9df...@cvs.netbsd.org>

  | Note normalizing behavior of mktime(3) using language from FreeBSD.

If we are going to start specifying what happens (how the struct tm
is normalised) then we really also need to specify in which order
the fields are corrected.   It makes a difference.

kre



Re: CVS commit: src/lib/libc/stdlib

2022-09-29 Thread Robert Elz
Date:Thu, 29 Sep 2022 08:18:49 -0400
From:Christos Zoulas 
Message-ID:  

  | Yes, I had forgotten about the need to do this explicitly...

  | > On Sep 28, 2022, at 10:23 PM, Robert Elz  wrote:
  | > 
  | > Apologies, I did not read the code closely enough, there must be a bug
  | > in the way the clone file descriptor is created.

I think I know the cause now, and it isn't anything specific to /dev/ptmx
it would affect all cloning devices.

In kern/vfs_syscalls.c do_open() we see the following:

if (vp == NULL) {
fd_abort(p, fp, indx);
error = fd_dupopen(dupfd, dupfd_move, flags, );
if (error)
return error;
*fd = indx;   
} else {
error = open_setfp(l, fp, vp, indx, flags);
if (error)
return error;
VOP_UNLOCK(vp);
*fd = indx;
fd_affix(p, fp, indx);
}

The vp==NULL case is where cloning devices are handled.   fd_dupopen()
arranges top make the duplicate happen.

It attempts to handle O_CLOEXEC thus:

error = fd_dup(fp, 0, newp, ff->ff_exclose);

where ff_exclose is the close_on_exec bit for the old file descriptor
(the one being duplicated) which is just fine for other calls of fd_dupopen().

The vp!=NULL case above is for "normal" opens, and open_setfp() is what
handles O_CLOEXEC - that's what (eventually) makes ff_exclose be true
(it also handles O_EXLOCK and O_SHLOCK.

None of that is being done in the cloning device open case, so those 3
flags simply cannot work, as the code is written currently, for these
devices.

I'm not about to go playing in this very fiddly piece of code, but I'm
kind of hoping that dholland@ might know the right magic sequence to make
this issue go away.

It probably means a call to open_setfp() with some parameters or other,
somewhere in the vp==NULL side of that if statement, though, just what
I have not attempted to work out.

kre



Re: CVS commit: src/lib/libc/stdlib

2022-09-29 Thread Christos Zoulas
Yes, I had forgotten about the need to do this explicitly...

christos

> On Sep 28, 2022, at 10:23 PM, Robert Elz  wrote:
> 
> Apologies, I did not read the code closely enough, there must be a bug
> in the way the clone file descriptor is created.
> 
> kre



signature.asc
Description: Message signed with OpenPGP


Re: CVS commit: src/lib/libc/stdlib

2022-09-28 Thread Robert Elz
Apologies, I did not read the code closely enough, there must be a bug
in the way the clone file descriptor is created.

kre


Re: CVS commit: src/lib/libc/stdlib

2022-09-28 Thread Robert Elz
Date:Wed, 28 Sep 2022 20:48:53 -0400
From:"David H. Gutteridge" 
Message-ID:  <9c9c8e9d9384338320b47dabfdc94...@gutteridge.ca>

   
  | (O_CLOEXEC, on the other hand, is ignored, at the moment.)

No it isn't, your test is faulty, O_CLOEXEC is a different
kind of flag, applies at a different level, and is fetched
a different way.

That's what dholland@ tried to tell you a few days ago.

kre
  |
  | $ cat open_test.c
  | #include 
  | #include 
  | #include 
  |
  | int main(int argc, char* argv[])
  | {
  | int fd, flags;
  |
  | printf("Regular file (read-only) with O_NONBLOCK | O_CLOEXEC.\n");
  | if ((fd = open("/etc/release", O_RDONLY | O_NONBLOCK | O_CLOEXEC)) < 0)
  | printf("Failed to get file handle.\n");
  | else {
  | printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD));
  | printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0));
  | close(fd);
  | }
  |
  | printf("POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC.\n");
  | if ((fd = open("/dev/ptmx", O_RDWR | O_NOCTTY | O_NONBLOCK | 
  | O_CLOEXEC)) < 0)
  | printf("Failed to open /dev/ptmx.\n");
  | else {
  | printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD));
  | printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0));
  | close(fd);
  | }
  |
  | return 0;
  | }
  |
  | $ ./open_test
  | Regular file (read-only) with O_NONBLOCK | O_CLOEXEC.
  | Descriptor flags: 1
  | Status flags: 4
  | POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC.
  | Descriptor flags: 0
  | Status flags: 6
  |
  | Regards,
  |
  | Dave
  |


Re: CVS commit: src/lib/libc/stdlib

2022-09-28 Thread David H. Gutteridge

On Wed, 28 Sep 2022, at 15:07:41 - (UTC), Christos Zoulas wrote:

In article <20220928003547.D2375FA92%cvs.NetBSD.org@localhost>,
David H. Gutteridge  wrote:

-=-=-=-=-=-

Module Name:src
Committed By:   gutteridge
Date:   Wed Sep 28 00:35:47 UTC 2022

Modified Files:
src/lib/libc/stdlib: posix_openpt.3

Log Message:
posix_openpt.3: reflect flag changes from r. 1.44 of tty_ptm.c

Some flags are now accepted, others are still ignored. (E.g., other
BSDs would return EINVAL if O_RDWR wasn't passed, and we now accept
O_NONBLOCK but not O_CLOEXEC.)


How so?

#define FCNTLFLAGS  
(FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FDSYNC|FRSYNC|FALTIO|\

  ^
FDIRECT|FNOSIGPIPE)
/* bits to save after open(2) */
#define FMASK   (FREAD|FWRITE|FCNTLFLAGS|FEXEC)
 ^^

Best,

christos


Hi Christos,

I'm sorry, I don't follow your question? I wrote "we now accept
O_NONBLOCK...", which is what I'm reading you're saying too?

(O_CLOEXEC, on the other hand, is ignored, at the moment.)

$ cat open_test.c
#include 
#include 
#include 

int main(int argc, char* argv[])
{
int fd, flags;

printf("Regular file (read-only) with O_NONBLOCK | O_CLOEXEC.\n");
if ((fd = open("/etc/release", O_RDONLY | O_NONBLOCK | O_CLOEXEC)) < 0)
printf("Failed to get file handle.\n");
else {
printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD));
printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0));
close(fd);
}

printf("POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC.\n");
	if ((fd = open("/dev/ptmx", O_RDWR | O_NOCTTY | O_NONBLOCK | 
O_CLOEXEC)) < 0)

printf("Failed to open /dev/ptmx.\n");
else {
printf("Descriptor flags: %d\n", flags = fcntl(fd, F_GETFD));
printf("Status flags: %d\n", flags = fcntl(fd, F_GETFL, 0));
close(fd);
}

return 0;
}

$ ./open_test
Regular file (read-only) with O_NONBLOCK | O_CLOEXEC.
Descriptor flags: 1
Status flags: 4
POSIX pt cloning device with O_NONBLOCK | O_CLOEXEC.
Descriptor flags: 0
Status flags: 6

Regards,

Dave


Re: CVS commit: src/lib/libc/stdlib

2022-09-28 Thread Christos Zoulas
In article <20220928003547.d2375f...@cvs.netbsd.org>,
David H. Gutteridge  wrote:
>-=-=-=-=-=-
>
>Module Name:   src
>Committed By:  gutteridge
>Date:  Wed Sep 28 00:35:47 UTC 2022
>
>Modified Files:
>   src/lib/libc/stdlib: posix_openpt.3
>
>Log Message:
>posix_openpt.3: reflect flag changes from r. 1.44 of tty_ptm.c
>
>Some flags are now accepted, others are still ignored. (E.g., other
>BSDs would return EINVAL if O_RDWR wasn't passed, and we now accept
>O_NONBLOCK but not O_CLOEXEC.)

How so?

#define FCNTLFLAGS  (FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FDSYNC|FRSYNC|FALTIO|\
   ^
 FDIRECT|FNOSIGPIPE)
/* bits to save after open(2) */
#define FMASK   (FREAD|FWRITE|FCNTLFLAGS|FEXEC)
  ^^

Best,

christos



Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)

2022-03-26 Thread Jason Thorpe


> On Mar 26, 2022, at 9:39 AM, Taylor R Campbell 
>  wrote:
> 
> `C string' is ambiguous because there are also char arrays that
> function as strings but which are not guaranteed to be NUL-terminated,
> as strncpy is intended for.

A non-terminated char array is not a C-string.  The term C-string is not 
ambiguous.  This is something that, amazingly, even Internet trolls appear to 
agree on.  However, they do disagree as to the spelling of the terminating 
character's name, which is why I think it's best to elide it altogether.

-- thorpej



Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)

2022-03-26 Thread Jason Thorpe


> On Mar 26, 2022, at 9:09 AM, Warner Losh  wrote:
> 
> Since all the 'C' standards[*] use "null-terminated" and "null character", 
> it's likely best to use that terminology because there is a source of truth 
> for its definition in case of ambiguity or doubt.

Ah, but you're giving up the opportunity to use indirection to solve the 
problem.  By calling it a "C-string", then those who care what the standard 
calls the terminating character can go look it up! :-)

-- thorpej



Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)

2022-03-26 Thread Warner Losh
On Sat, Mar 26, 2022 at 9:53 AM Roland Illig  wrote:

> Am 24.03.2022 um 02:55 schrieb David H. Gutteridge:
> > Module Name:  src
> > Committed By: gutteridge
> > Date: Thu Mar 24 01:55:15 UTC 2022
> >
> > Modified Files:
> >   src/lib/libc/gen: popen.3
> >
> > Log Message:
> > popen.3: minor spelling, grammar, style, and xref tweaks
> >
> >
> > To generate a diff of this commit:
> > cvs rdiff -u -r1.22 -r1.23 src/lib/libc/gen/popen.3
>
> The term "null-terminated string" is quite common when talking about C.
>   In contrast, the word "nul" in "nul-terminated" always reminds me of
> the character abbreviation in ASCII, which has a narrower scope than C.
>   I prefer to keep "null-terminated" here.
>

The standard uses "null-terminated" and "null character" (see Character
Sets section 5.2.1 (from the C2x draft, but this term dates back to C89):
"A byte with all bits set to 0, called the null character, shall exist in
the basic execution character set; it is used to terminate a character
string."
I couldn't find the definition for null-terminated though. This is
different than the NULL #define

Not to be confused with the all zeros ASCII charater, whose mnemonic is
NUL, which is where some pressure to use NUL terminated comes from. I agree
that it's usage is narrower and really only relevant for certain ASCII and
ASCII-derived character sets, which is why the standard chose the spelling
it did.

Since all the 'C' standards[*] use "null-terminated" and "null character",
it's likely best to use that terminology because there is a source of truth
for its definition in case of ambiguity or doubt.

Warner

[*] I've not gone the extra mile and checked to see if K used this
phrase, to be honest.


Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)

2022-03-26 Thread Taylor R Campbell
> Date: Sat, 26 Mar 2022 16:53:19 +0100
> From: Roland Illig 
> 
> The term "null-terminated string" is quite common when talking about C.
>   In contrast, the word "nul" in "nul-terminated" always reminds me of
> the character abbreviation in ASCII, which has a narrower scope than C.
>   I prefer to keep "null-terminated" here.

I feel like I've usually seen it as NUL-terminated.  I thought it was
in /usr/share/misc/style but I must have been thinking of a different
style guide.

`NUL' is better than `null' or `NULL' here because it's not a null
pointer, unlike, e.g., the execve argv terminator.  Even if the string
isn't US-ASCII, what character encoding calls a nonzero byte `NUL'?

`NUL' is better than `zero' or `0' here because it's unambiguously the
all-bits-zero byte, not the US-ASCII encoding of `0' (i.e., decimal 48
or 0x30).

`C string' is ambiguous because there are also char arrays that
function as strings but which are not guaranteed to be NUL-terminated,
as strncpy is intended for.


Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)

2022-03-26 Thread Jason Thorpe


> On Mar 26, 2022, at 9:17 AM, Martin Husemann  wrote:
> When talking about it I prefer "zero terminated", or C-string, in
> contrast to C++ std::string (which are objects) or Pascal strings
> (which have an explicit length at the beginning).

Yes, I also prefer the term “C-string"

-- thorpej







Re: null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)

2022-03-26 Thread Martin Husemann
On Sat, Mar 26, 2022 at 04:53:19PM +0100, Roland Illig wrote:
> The term "null-terminated string" is quite common when talking about C.

NULL terminated lists/array are quite common, but NULL is a pointer and
the string is terminated by a 0 char (sometimes spelled as \0 in a string
literal, but implicitly added by the compiler at the end of a literal,
and spelled as NUL in the ascii table).

>  I prefer to keep "null-terminated" here.

I think it is a bug.

When talking about it I prefer "zero terminated", or C-string, in
contrast to C++ std::string (which are objects) or Pascal strings
(which have an explicit length at the beginning).

Martin


null-terminated vs. nul-terminated (was: Re: CVS commit: src/lib/libc/gen)

2022-03-26 Thread Roland Illig

Am 24.03.2022 um 02:55 schrieb David H. Gutteridge:

Module Name:src
Committed By:   gutteridge
Date:   Thu Mar 24 01:55:15 UTC 2022

Modified Files:
src/lib/libc/gen: popen.3

Log Message:
popen.3: minor spelling, grammar, style, and xref tweaks


To generate a diff of this commit:
cvs rdiff -u -r1.22 -r1.23 src/lib/libc/gen/popen.3


The term "null-terminated string" is quite common when talking about C.
 In contrast, the word "nul" in "nul-terminated" always reminds me of
the character abbreviation in ASCII, which has a narrower scope than C.
 I prefer to keep "null-terminated" here.

Roland


Re: CVS commit: src/lib/libc/time

2022-03-26 Thread Christos Zoulas
In article <977b81a4-d330-6722-7ce4-cc4e61011...@gmx.de>,
Roland Illig   wrote:
>Am 25.03.2022 um 22:25 schrieb Christos Zoulas:
>> In article <20220325183551.0f039f...@cvs.netbsd.org>,
>> Roland Illig  wrote:
>>> -=-=-=-=-=-
>>>
>>> Module Name:src
>>> Committed By:   rillig
>>> Date:   Fri Mar 25 18:35:50 UTC 2022
>>>
>>> Modified Files:
>>> src/lib/libc/time: localtime.c
>>>
>>> Log Message:
>>> localtime.c: add back storage class 'register'
>>>
>>> This reduces the differences to the upstream code.
>>
>> I don't know why you did that. It is as simple to sed -e 's/register //g'
>> in upstream when you diff. Adding register is useless; these days it is
>> mostly ignored by compilers, except you can't &...
>
>I thought that having a small diff to the upstream code was more
>important than a few saved keywords.  If I was wrong, I can of course
>revert it.

I don't care too much, less clutter without obsolete keywords that do
nothing or worse.

christos



Re: CVS commit: src/lib/libc/time

2022-03-25 Thread Roland Illig

Am 26.03.2022 um 00:50 schrieb Tobias Nygren:

On Sat, 26 Mar 2022 00:31:45 +0100
Roland Illig  wrote:


localtime.c: add back storage class 'register'

This reduces the differences to the upstream code.


I don't know why you did that. It is as simple to sed -e 's/register //g'
in upstream when you diff. Adding register is useless; these days it is
mostly ignored by compilers, except you can't &...


I thought that having a small diff to the upstream code was more
important than a few saved keywords.  If I was wrong, I can of course
revert it.


Can we keep it as-is, but #define register to empty at the top of the file?


I don't think the compiler's view on the code is the main point of this
discussion.  At least, GCC generates the same code on x86_64, with or
without 'register', so it doesn't seem to care.

To me, the more interesting question is how human readers should deal
with the code.

If this piece of code had originated in the NetBSD tree, I wouldn't have
added the 'register' keywords, I would have just indented the code
according to KNF.

But, most of this code is copied from the tz project, therefore our goal
should be to make the diff minimal, in order to benefit from the testing
of other projects using the same code.  In this case, it means to keep
the indentation exactly as inconsistent as upstream and also to keep the
keyword 'register'.  The remaining diff is still large enough to consume
some developer time, containing a few fixes for WARNS=6 and for the lint
checks.

Roland


Re: CVS commit: src/lib/libc/time

2022-03-25 Thread Tobias Nygren
On Sat, 26 Mar 2022 00:31:45 +0100
Roland Illig  wrote:

> >> localtime.c: add back storage class 'register'
> >>
> >> This reduces the differences to the upstream code.
> >
> > I don't know why you did that. It is as simple to sed -e 's/register //g'
> > in upstream when you diff. Adding register is useless; these days it is
> > mostly ignored by compilers, except you can't &...
> 
> I thought that having a small diff to the upstream code was more
> important than a few saved keywords.  If I was wrong, I can of course
> revert it.

Can we keep it as-is, but #define register to empty at the top of the file?



Re: CVS commit: src/lib/libc/time

2022-03-25 Thread Roland Illig

Am 25.03.2022 um 22:25 schrieb Christos Zoulas:

In article <20220325183551.0f039f...@cvs.netbsd.org>,
Roland Illig  wrote:

-=-=-=-=-=-

Module Name:src
Committed By:   rillig
Date:   Fri Mar 25 18:35:50 UTC 2022

Modified Files:
src/lib/libc/time: localtime.c

Log Message:
localtime.c: add back storage class 'register'

This reduces the differences to the upstream code.


I don't know why you did that. It is as simple to sed -e 's/register //g'
in upstream when you diff. Adding register is useless; these days it is
mostly ignored by compilers, except you can't &...


I thought that having a small diff to the upstream code was more
important than a few saved keywords.  If I was wrong, I can of course
revert it.

Roland


Re: CVS commit: src/lib/libc/time

2022-03-25 Thread Christos Zoulas
In article ,
Roland Illig   wrote:
>
>The mess with the indentation comes from upstream.  In our copy it's a
>bit worth than upstream, but not much.

I've asked upstream to indent consistently in the past. It has not happened.
I try to minimize the diffs with upstream so that we can keep patching in
our changes. I've also submitted our changes for inclusion to upstream.
Some have been accepted, others have not.

Best,

christos



Re: CVS commit: src/lib/libc/time

2022-03-25 Thread Christos Zoulas
In article <20220325183551.0f039f...@cvs.netbsd.org>,
Roland Illig  wrote:
>-=-=-=-=-=-
>
>Module Name:   src
>Committed By:  rillig
>Date:  Fri Mar 25 18:35:50 UTC 2022
>
>Modified Files:
>   src/lib/libc/time: localtime.c
>
>Log Message:
>localtime.c: add back storage class 'register'
>
>This reduces the differences to the upstream code.

I don't know why you did that. It is as simple to sed -e 's/register //g'
in upstream when you diff. Adding register is useless; these days it is
mostly ignored by compilers, except you can't &...

christos



Re: CVS commit: src/lib/libc/time

2022-03-25 Thread Christos Zoulas
In article <20220325190016.15114f...@cvs.netbsd.org>,
Roland Illig  wrote:
>-=-=-=-=-=-
>
>Module Name:   src
>Committed By:  rillig
>Date:  Fri Mar 25 19:00:16 UTC 2022
>
>Modified Files:
>   src/lib/libc/time: localtime.c
>
>Log Message:
>localtime.c: take indentation style from upstream

That I am fine with, but remember that some of the indentation changes
in upstream were gratuitus (no code changes).

christos



Re: CVS commit: src/lib/libc/time

2022-03-25 Thread Roland Illig

Am 25.03.2022 um 13:59 schrieb Robert Elz:

 Date:Thu, 24 Mar 2022 23:32:30 +0100
 From:Roland Illig 
 Message-ID:  <6bb00924-edaf-a4c8-348e-ba1304d57...@gmx.de>

   | Someone should clean up this mess.

No, they probabky shouldn't, in general.

That source comes from the tz project (currently from
tzcode2022a) with local modifications.


When I said "somebody" above, I really meant "somebody at the
appropriate position".  I already suspected that this would not be
NetBSD.  Sorry for the brevity.


I think upstream mostly uses ts=4 (or more likely
the emacs equivalent).   we mostky use tab, and view
with ts=8 and yes, that can mead to ugly results.


The mess with the indentation comes from upstream.  In our copy it's a
bit worth than upstream, but not much.

Roland


Re: CVS commit: src/lib/libc/time

2022-03-25 Thread Robert Elz
Date:Thu, 24 Mar 2022 23:32:30 +0100
From:Roland Illig 
Message-ID:  <6bb00924-edaf-a4c8-348e-ba1304d57...@gmx.de>

  | Someone should clean up this mess.

No, they probabky shouldn't, in general.

That source comes from the tz project (currently from
tzcode2022a) with local modifications.

I think upstream mostly uses ts=4 (or more likely
the emacs equivalent).   we mostky use tab, and view
with ts=8 and yes, that can mead to ugly results.

But to ease merging of future versions, this should be
touched as little as possible.

kre

ps: one way and another we have plenty of localtime
testing, not all in tests/libc.  At least for general
correctness.  I don't know if all the various hard casez
are as well tested locally, but changes that would
affect thise generally come from upstream.   Lots of
people do lots of tests on the upstream code.


Re: CVS commit: src/lib/libc/time

2022-03-24 Thread Roland Illig

Am 24.03.2022 um 17:15 schrieb Christos Zoulas:

Module Name:src
Committed By:   christos
Date:   Thu Mar 24 16:15:05 UTC 2022

Modified Files:
src/lib/libc/time: localtime.c

Log Message:
put back the 2022a changes and fix the misplaced brace.


To generate a diff of this commit:
cvs rdiff -u -r1.128 -r1.129 src/lib/libc/time/localtime.c


The indentation of this file looks completely random:

Line 272 has a comment indented by 2 spaces, the code in that block is
indented by 4 spaces instead of the usual tab.  Same for line 295.

Line 350 is indented differently than line 346.

Line 420 has a continuation that is indented by 2 spaces instead of the
usual 4.

Line 490 is indented by 2 spaces instead of the usual tab.

Line 513 is indented by 4 spaces instead of the usual tab.

Line 534 is indented by 2 spaces instead of the usual tab.

Line 585 is indented by 2 spaces instead of the usual tab.

Line 667 is indented by 2 tabs instead of the usual 1 tab.

Line 683 is indented by 2 tabs instead of the usual tab.

Line 688 is indented by 2 spaces instead of the usual tab.

Line 1083 contains unaligned variable declarations.

Line 2347 looks as if it were assuming a tab width of 4, but the
immediately following line breaks this assumption.


Someone should clean up this mess.  I don't know how good the test
coverage for localtime(3) is, so I'm happy to leave this (and a few
possible refactorings to decrease the indentation) to people more
familiar with the history and origins of this file.

Roland


Re: CVS commit: src/lib/libc/time

2022-03-23 Thread Martin Husemann
On Wed, Mar 23, 2022 at 02:00:29PM +0100, Tobias Nygren wrote:
> I think this commit broke something.
> date(1) no longer reports a time zone and is stuck at GMT time:
> 
> $ ls -l /etc/localtime
> lrwxr-xr-x  1 root  wheel  36 Mar 22 08:47 /etc/localtime -> 
> /usr/share/zoneinfo/Europe/Stockholm
> $ date
> Wed Mar 23 12:57:40  2022
> $ TZ=CET date
> Wed Mar 23 12:57:44  2022
> $ TZ=foo date
> Wed Mar 23 12:57:48 GMT 2022

The lib/libc/time tests are failing and e.g. this:

zdump -v /usr/share/zoneinfo/US/Eastern

gives a suprisingly short output.

Martin


Re: CVS commit: src/lib/libc/time

2022-03-23 Thread Tobias Nygren
On Tue, 22 Mar 2022 13:48:39 -0400
Christos Zoulas  wrote:

> Modified Files:
>   src/lib/libc/time: CONTRIBUTING Makefile NEWS localtime.c private.h
>   theory.html tz-link.html tzselect.8 tzselect.ksh version zdump.c
>   zic.c
> 
> Log Message:
> welcome to tzcode-2022a

Hi,

I think this commit broke something.
date(1) no longer reports a time zone and is stuck at GMT time:

$ ls -l /etc/localtime
lrwxr-xr-x  1 root  wheel  36 Mar 22 08:47 /etc/localtime -> 
/usr/share/zoneinfo/Europe/Stockholm
$ date
Wed Mar 23 12:57:40  2022
$ TZ=CET date
Wed Mar 23 12:57:44  2022
$ TZ=foo date
Wed Mar 23 12:57:48 GMT 2022

Kind regards,
-Tobias


Re: CVS commit: src/lib/libc/stdlib

2022-03-12 Thread Tobias Nygren
On Sat, 12 Mar 2022 08:26:01 +
Nia Alarie  wrote:

> Module Name:  src
> Committed By: nia
> Date: Sat Mar 12 08:26:01 UTC 2022
> 
> Modified Files:
>   src/lib/libc/stdlib: hcreate.c
> 
> Log Message:
> hcreate(3): use reallocarr instead of malloc(x * y)

Caution: malloc(0) and reallocarr(, 0, s) have different semantics
for the returned pointer value.



Re: CVS commit: src/lib/libc/stdlib

2022-02-12 Thread Warner Losh
On Sat, Feb 12, 2022, 11:52 AM Taylor R Campbell <
campbell+netbsd-source-change...@mumble.net> wrote:

> > Date: Sun, 13 Feb 2022 05:44:38 +1100
> > from: matthew green 
> >
> > "Roland Illig" writes:
> > > Module Name:src
> > > Committed By:   rillig
> > > Date:   Fri Feb 11 21:36:46 UTC 2022
> > >
> > > Modified Files:
> > > src/lib/libc/stdlib: getenv.c
> > >
> > > Log Message:
> > > libc/getenv: remove trailing whitespace
> > >
> > > No binary change.
> >
> > please avoid purely whitespace changes unless you're
> > also going to be modifying the code itself.
> >
> > (don't back out now.)
>
> I thought we were supposed to _avoid_ mixing whitespace changes and
> functional changes?
>
> (Obviously the solution is to just only ever commit code with perfect
> style to begin with so this is never an issue!  Speaking of which,
> (setq show-trailing-whitespace t) is helpful to avoid this kind of
> mistake up front.  git diff or git add -p will also flag this kind of
> mistake, if you're working in git.  /usr/share/misc/NetBSD.el is also
> helpful for other KNF style.)
>

I thought the rule was don't make purely whitespace changes UNLESS you plan
on making other changes to (or are fixing an oops you just made). If that's
the case, separate the two commits.

Warner

>


Re: CVS commit: src/lib/libc/stdlib

2022-02-12 Thread Roland Illig

Am 12.02.2022 um 20:10 schrieb Warner Losh:

I thought the rule was don't make purely whitespace changes UNLESS you
plan on making other changes to (or are fixing an oops you just made).
If that's the case, separate the two commits.


That sounds like a useful rule, I'll stick to it.

Roland


Re: CVS commit: src/lib/libc/stdlib

2022-02-12 Thread Taylor R Campbell
> Date: Sun, 13 Feb 2022 05:44:38 +1100
> from: matthew green 
> 
> "Roland Illig" writes:
> > Module Name:src
> > Committed By:   rillig
> > Date:   Fri Feb 11 21:36:46 UTC 2022
> >
> > Modified Files:
> > src/lib/libc/stdlib: getenv.c
> >
> > Log Message:
> > libc/getenv: remove trailing whitespace
> >
> > No binary change.
> 
> please avoid purely whitespace changes unless you're
> also going to be modifying the code itself.
> 
> (don't back out now.)

I thought we were supposed to _avoid_ mixing whitespace changes and
functional changes?

(Obviously the solution is to just only ever commit code with perfect
style to begin with so this is never an issue!  Speaking of which,
(setq show-trailing-whitespace t) is helpful to avoid this kind of
mistake up front.  git diff or git add -p will also flag this kind of
mistake, if you're working in git.  /usr/share/misc/NetBSD.el is also
helpful for other KNF style.)


re: CVS commit: src/lib/libc/stdlib

2022-02-12 Thread matthew green
"Roland Illig" writes:
> Module Name:  src
> Committed By: rillig
> Date: Fri Feb 11 21:36:46 UTC 2022
>
> Modified Files:
>   src/lib/libc/stdlib: getenv.c
>
> Log Message:
> libc/getenv: remove trailing whitespace
>
> No binary change.

please avoid purely whitespace changes unless you're
also going to be modifying the code itself.

(don't back out now.)

thanks.


.mrg.


Re: CVS commit: src/lib/libc/string

2021-10-29 Thread Joerg Sonnenberger
On Fri, Oct 29, 2021 at 10:11:57AM +, Nia Alarie wrote:
> Module Name:  src
> Committed By: nia
> Date: Fri Oct 29 10:11:57 UTC 2021
> 
> Modified Files:
>   src/lib/libc/string: wcsdup.c
> 
> Log Message:
> wcsdup(3): use reallocarr to catch integer overflow

Except that no such integer overflow can happen, since the input string
already has done the same computation.

Joerg


Re: CVS commit: src/lib/libc/arch/powerpc/string

2021-07-25 Thread Rin Okuyama

On 2021/07/25 6:32, Joerg Sonnenberger wrote:

On Sat, Jul 24, 2021 at 05:27:26AM +, Rin Okuyama wrote:

Module Name:src
Committed By:   rin
Date:   Sat Jul 24 05:27:26 UTC 2021

Modified Files:
src/lib/libc/arch/powerpc/string: Makefile.inc

Log Message:
For evbppc, use C version of bcopy(3), memcpy(3), memcmp(3), and
memmove(3) consistently for debug library (*.go) in order to avoid
alignment faults for 403.


Why do we want to pessimize all evbppc targets just for issues with the
403 design?


Well, for kernel, we can readily switch unaffected machines to
assembler versions.

For userland, we can provide something like libc_ua.so for machines
with capability of unaligned access. Some time, I will benchmark
whether it is worth the cost.

Thanks,
rin


Re: CVS commit: src/lib/libc/arch/powerpc/string

2021-07-24 Thread Joerg Sonnenberger
On Sat, Jul 24, 2021 at 05:27:26AM +, Rin Okuyama wrote:
> Module Name:  src
> Committed By: rin
> Date: Sat Jul 24 05:27:26 UTC 2021
> 
> Modified Files:
>   src/lib/libc/arch/powerpc/string: Makefile.inc
> 
> Log Message:
> For evbppc, use C version of bcopy(3), memcpy(3), memcmp(3), and
> memmove(3) consistently for debug library (*.go) in order to avoid
> alignment faults for 403.

Why do we want to pessimize all evbppc targets just for issues with the
403 design?

Joerg


Re: CVS commit: src/lib/libc/arch/arm

2021-06-29 Thread Rin Okuyama

Oops, this is broken. I will fix shortly...

rin

On 2021/06/30 8:29, Rin Okuyama wrote:

Module Name:src
Committed By:   rin
Date:   Tue Jun 29 23:29:12 UTC 2021

Modified Files:
src/lib/libc/arch/arm/gen: swapcontext.S
src/lib/libc/arch/arm/sys: __clone.S

Log Message:
Align sp to 8-byte boundary as required by EABI.

IIUC, this change only affects libc compiled for ``Thumb-mode userland'',
which we've not officially supported yet.


To generate a diff of this commit:
cvs rdiff -u -r1.15 -r1.16 src/lib/libc/arch/arm/gen/swapcontext.S
cvs rdiff -u -r1.9 -r1.10 src/lib/libc/arch/arm/sys/__clone.S

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.


Re: vfork() and posix_spawn() [was: Re: CVS commit: src/lib/libc/sys]

2021-06-14 Thread Robert Elz
Date:Mon, 14 Jun 2021 03:56:48 +0200
From:Joerg Sonnenberger 
Message-ID:  

  | This is even more true for multi-threaded applications
  | (where POSIX explicitly suggests that requirement).

Sure, anything with fork() and threads has issues, that's messy.
Even I know that, and I know very little about threads.

  | On the specific topic, I'm somewhat puzzled by the claim that fork is
  | async-signal-safe

That's not what I said, I said it isn't (though the way I phrased that
might not have been all that clear - I know I often type too much, and
sometimes overcompensate by typing too little).   But (from my message):

If the "that said" relates to fork() or vfork() not being async
signal safe, so a double fork() (when the first is vfork anyway)
would not be condoned,

was meant to acknowledge that fork() (and vfork()) aren't async signal
safe, and so if a requirement that async signal safe functions are all
that is permitted after vfork() actually existed, then a
if (vfork() == 0) fork();
sequence would not be permitted (not generate defined actions).

But for the purposes of whether double fork() option to posix_spawn
is useful or not, this restriction doesn't matter, as the above can be
(eventually) written
if (vfork() == 0) _Fork();
instead, as _Fork() is (to be) async signal safe. Same effect as the
(perhaps) undefined version above for this purpose.

  | since most implementations will require synchronisation for pthread_atfork

Yes, which is what _Fork() does not do (_Fork() is to fork() as _exit() is
to exit() - in a sense).

  | It most likely should. The main reason is that much of the system can
  | and often do depend on things like mutexes to ensure correctness and all
  | that is essentially UB after vfork().

Actually, it isn't UB, or shouldn't be.   The behaviour of vfork() is
actually very precisely specified (which doesn't take much, as it is
quite simple).   There's very little room for UB.   What there is is
plenty of room for simple errors to screw things (and appear as if it
is UB, when it is actually quite well defined broken code).   There's no
problem using mutexes after a vfork() with two caveats - first anything locked
must be unlocked again before the eventual _exit() or exec() (the mutex
must be returned to its state at the time of the vfork()) and the
child must not need to lock anything which might be already locked
in the parent (as that's a guaranteed deadlock).   How one makes the
latter condition work is for someone else to solve...

That said, most of what would require a mutex isn't something that should
be being done after a vfork() anyway - most of those operations are likely
to be things which change state of the process, and that's what (at least
without careful preparation) cannot happen in a child of a vfork() (so no
use of stdio, no use of malloc, ...)

  | That's even ignoring the stronger issue of mutating state.

Yes, though if it is intentional, that can actually be OK (sometimes).
Our /bin/sh relies upon the ability to modify its parent's state after
a vfork() to communicate status back to the parent (more than is possible
reliably with exit status - as the parent cannot tell from that whether
the status it receives is from its own image running the child after the
vfork() or from some other process after an exec() succeeded).
A modified volatile variable (in the parent), however, can only have
been modified by the vfork() child (we assume here that the parent
isn't modifying the variable itself elsewhere, naturally).

And yes, that means that on any system that "implements" vfork() using
cc -Dvfork=fork (or its equivalent) cannot use vfork() with our sh - we
need a "real" vfork, or must simply disable its use and only use fork
(which is what the -D implementation does anyway of course, just badly).

  | vfork use really should die...

That sentiment has been around for a long time - almost since vfork()
was first created (40+ years ago).   But it is still here.   Implementing
fork() using CoW was supposed to solve all the issues with fork().  It
didn't, vfork() is still lots faster.

posix_spawn() might allow some uses of vfork()+exec() to be retired, that
would be good, but it isn't going to get all of them.

  | No, it relates to one common pattern for used by or for daemon.

Yes, I understood that, but why do we care?   Daemons start how often?
What percentage of the forks() (vfork is, in practice, never used for
this) on your systems are generated by daemons starting?   What kind of
saving would you expect to see from allowing posix_spawn() to replace
that fork();fork() sequence ?   Is it really worth even the tiniest
extra complexity in that already fairly complex interface to handle
this almost irrelevant issue?   Do you believe in non-memory managed
real time systems there are any daemon processes like that at all?

  | There are non-trivial uses of fork, yes. There are much less 

Re: vfork() and posix_spawn() [was: Re: CVS commit: src/lib/libc/sys]

2021-06-13 Thread Joerg Sonnenberger
On Mon, Jun 14, 2021 at 01:33:51AM +0700, Robert Elz wrote:
> Date:Sat, 12 Jun 2021 23:13:54 +0200
> From:Joerg Sonnenberger 
> Message-ID:  
> 
> Sorry, missed this message when I was cherry picking messages to read in
> a timely fashion.
> 
>   | On Wed, Jun 09, 2021 at 01:03:20AM +0700, Robert Elz wrote:
>   | > after a vfork() the child can do anything
> 
>   | This is not true. After vfork(), it is only supposed to call async
>   | signal safe functions.
> 
> What you said, and what I said are not contradictory.   Note I said "can
> do" and you said "is only supposed to".
> 
> What is supposed to work and what actually does work (or can be made to
> work) are two different things.

Sure, but the set of what works reliably has actually been shrinking
over time and more bugs in various programs using vfork specifically
have been discovered. This is even more true for multi-threaded
applications (where POSIX explicitly suggests that requirement). On the
specific topic, I'm somewhat puzzled by the claim that fork is
async-signal-safe since most implementations will require
synchronisation for pthread_atfork and that seems to place quite a
burden on the implementation...

> I would note though our man page doesn't make that requirement, or not
> that I can see anyway, and vfork() is not in POSIX of course - and while
> I don't have a copy to check, I'd suspect not in the C standards either),
> so while that sounds like a reasonable requirement to impose for safer
> operation, I'm not sure that anyone or anything actually does that.

It most likely should. The main reason is that much of the system can
and often do depend on things like mutexes to ensure correctness and all
that is essentially UB after vfork(). That's even ignoring the stronger
issue of mutating state. vfork use really should die...

> 
>   | That said, a flag for the double fork semantic might be useful.
> 
> If the "that said" relates to fork() or vfork() not being async signal safe,
> so a double fork() (when the first is vfork anyway) would not be condoned,
> that doesn't matter, there to be an async-signal-safe _Fork() call added to
> the next version of POSIX, so even with the "only async signal safe"
> requirement for vfork() a:

No, it relates to one common pattern for used by or for daemon.

> Please don't see posix_spawn() as being (or ever becoming) a panacea that
> can replace all fork() (or even just vfork()) calls - even just in the
> cases where an exec() follows soon after.   It works fine for most simple
> cases, but there are lots of not-so-simple situations that it cannot handle,
> and burdening the interface with the ability to deal with all of those
> would reduce the "efficiently implementable" goal for little real benefit.

There are non-trivial uses of fork, yes. There are much less non-trivial
uses of vfork as the latter already has quite a few problems with spooky
actions otherwise. Supporting something like a double fork flag has very
little impact on the complexity of the implementation and even less
impact on the efficiency. We certainly are at the point where we can
start analyzing the remaining blockers for (v)fork+exec users.

> Another example is made obvious by the bug Martin committed a fix for
> in the past few days ... the order of operations was incorrect in our
> implementation.  But that this problem can exist shows that there
> are ordering issues - a process that wants to open files using its
> current privs, then reset those before (perhaps) opening more files,
> and then doing the exec() part cannot do that with posix_spawn().
> Again, this is rare enough that adding the complexity required to allow
> it to work just isn't worth it.   [In this area also note that the
> POSIX_SPAWN_RESETIDS flag causes (effectively) setuid(getuid()), there's
> no similar operation to do setuid(geteuid()) ... again too rare to bother.]

I quite disagree here, actually. The design-level issue is that
POSIX_SPAWN_RESETIDS is a flag and not an action. This means it can't be
sequenced and that is the reason for the limitation. There is an obvious
parallel with the semantics of the chdir action here--that needs to be
that, an action and not just a flag. The separate concern is of course
that we need more testing for posix_spawn, but that is hopefully also
going to become better as side effect of the non-GSoC project.

Joerg


vfork() and posix_spawn() [was: Re: CVS commit: src/lib/libc/sys]

2021-06-13 Thread Robert Elz
Date:Sat, 12 Jun 2021 23:13:54 +0200
From:Joerg Sonnenberger 
Message-ID:  

Sorry, missed this message when I was cherry picking messages to read in
a timely fashion.

  | On Wed, Jun 09, 2021 at 01:03:20AM +0700, Robert Elz wrote:
  | > after a vfork() the child can do anything

  | This is not true. After vfork(), it is only supposed to call async
  | signal safe functions.

What you said, and what I said are not contradictory.   Note I said "can
do" and you said "is only supposed to".

What is supposed to work and what actually does work (or can be made to
work) are two different things.

I would note though our man page doesn't make that requirement, or not
that I can see anyway, and vfork() is not in POSIX of course - and while
I don't have a copy to check, I'd suspect not in the C standards either),
so while that sounds like a reasonable requirement to impose for safer
operation, I'm not sure that anyone or anything actually does that.

  | That said, a flag for the double fork semantic might be useful.

If the "that said" relates to fork() or vfork() not being async signal safe,
so a double fork() (when the first is vfork anyway) would not be condoned,
that doesn't matter, there to be an async-signal-safe _Fork() call added to
the next version of POSIX, so even with the "only async signal safe"
requirement for vfork() a:

if ((child = vfork()) == 0) {
if (_Fork())_exit(0);
/* do child stuff here */
} else {
if (child != -1) waitpid(child, , 0);
/* parent continues */
}

sequence is possible (will be possible, we don't have _Fork() yet)
and fully conforming, apart from vfork() not being in the standard.
The child of the vfork() only does _Fork() and _exit(), both of which
are permitted.  The grandchild process is created by a full fork() so
can do anything at all.

For what it is worth, the definition of _Fork() is (to be):

The _Fork() function shall be equivalent to fork(), except
that fork handlers established by means of the pthread_atfork()
function shall not be called and _Fork() shall be async-signal-safe.

Aside from the prototype line, which is just as you'd expect, that's it.

If (or when) we add it, we probably need to decide whether we need _VFork()
(or some similar name) as well.   Probably.

However, an extra posix_spawn attribute to handle double fork is almost
certainly not warranted, this kind of thing isn't all that common, and
adding the mechanism to the posix_spawn set of functions would really
just be bloat (there's also a proposal floating around to add the
ability to change a terminal's process group as a posix_spawn action --
that's required for shells to be able to use posix_spawn in general cases,
but that most likely won't go anywhere).

Remember what the standard says about posix_spawn() (in its rationale, this
is not strictly part of the standard)...

The posix_spawn( ) function and its close relation posix_spawnp( )
have been introduced to overcome the following perceived difficulties
with fork( ): the fork( ) function is difficult or impossible to
implement without swapping or dynamic address translation.
[...]
This view of the role of posix_spawn( ) and posix_spawnp( ) influenced
the design of their API. It does not attempt to provide the full
functionality of fork( )/exec in which arbitrary user-specified
operations of any sort are permitted between the creation of the
child process and the execution of the new process image; any attempt
to reach that level would need to provide a programming language as
parameters.
[...]
[ It lists some of the known problems with the posix_spawn interface ]
[...]
The posix_spawn( ) and posix_spawnp( ) functions do not have all the
power of fork( )/exec. This is to be expected. The fork( ) function
is a wonderfully powerful operation. We do not expect to duplicate
its functionality in a simple, fast function with no special hardware
requirements. It is worth noting that posix_spawn( ) and
posix_spawnp( ) are very similar to the process creation operations
on many operating systems that are not UNIX systems.

It then goes on to list the requirements they had for posix_spawn()...

The requirements for posix_spawn( ) and posix_spawnp( ) are:
� They must be implementable without an MMU or unusual hardware.
� They must be compatible with existing POSIX standards.

Additional goals are:
� They should be efficiently implementable.
� They should be able to replace at least 50% of typical
  executions of fork( ).
� A system with posix_spawn( ) and posix_spawnp( ) and without
  fork( ) should be useful, at least for realtime applications.
� A 

Re: CVS commit: src/lib/libc/sys

2021-06-12 Thread Joerg Sonnenberger
On Wed, Jun 09, 2021 at 01:03:20AM +0700, Robert Elz wrote:
> It should, when it is workable for the application, make things
> faster, while also avoiding the vfork() perils.   But only when
> it works -- after a vfork() the child can do anything (it should
> avoid touching its parent's state, but it can) posix_spawn() isn't
> nearly that general, it just handles the most common things apps
> are likely to want to do between the fork() and exec().

This is not true. After vfork(), it is only supposed to call async
signal safe functions. That said, a flag for the double fork semantic
might be useful.

Joerg


Re: CVS commit: src/lib/libc/sys

2021-06-08 Thread Robert Elz
Date:Tue, 8 Jun 2021 16:15:12 +
From:"Nia Alarie" 
Message-ID:  <20210608161512.1d7c3f...@cvs.netbsd.org>

  | vfork.2: recommend posix_spawn instead

You might want to reconsider the wording there, posix_spawn() is
an alternative to [v]fork() + exec*().   Not just vfork().

It should, when it is workable for the application, make things
faster, while also avoiding the vfork() perils.   But only when
it works -- after a vfork() the child can do anything (it should
avoid touching its parent's state, but it can) posix_spawn() isn't
nearly that general, it just handles the most common things apps
are likely to want to do between the fork() and exec().  Perhaps
most notably, it has no notion of (with err checking omitted):

if (fork()) { parent_code(/* do a waitpid() first */); }
else if (fork()) { exit(0); }
else { child_code() /* often ending with execX() */ }

when one wants to create an orphan (nothing that cannot be waited upon).

kre



Re: CVS commit: src/lib/libc/regex

2021-02-25 Thread Christos Zoulas
In article <5c9e716-7ec1-9c7d-a7cb-21f08946...@invisible.ca>,
Jared McNeill   wrote:
>Building tools on macOS:
>
>/Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8:
> 
>error: implicit declaration of function 'reallocarray' is invalid
>   in C99 [-Werror,-Wimplicit-function-declaration]
> ncs = reallocarray(p->g->sets, p->g->ncsets + 1, sizeof(*ncs));
>   ^
>/Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8:
> 
>note: did you mean 'reallocarr'?
>/Users/jmcneill/netbsd/git-src/tools/compat/compat_defs.h:556:5: note: 
>'reallocarr' declared here
>int reallocarr(void *, size_t, size_t);
> ^

Fixed, thanks!

christos



Re: CVS commit: src/lib/libc/regex

2021-02-25 Thread Jared McNeill

Building tools on macOS:

/Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8: 
error: implicit declaration of function 'reallocarray' is invalid

  in C99 [-Werror,-Wimplicit-function-declaration]
ncs = reallocarray(p->g->sets, p->g->ncsets + 1, sizeof(*ncs));
  ^
/Users/jmcneill/netbsd/git-src/tools/compat/../../lib/libc/regex/regcomp.c:1585:8: 
note: did you mean 'reallocarr'?
/Users/jmcneill/netbsd/git-src/tools/compat/compat_defs.h:556:5: note: 
'reallocarr' declared here

int reallocarr(void *, size_t, size_t);
^


On Tue, 23 Feb 2021, Christos Zoulas wrote:


Module Name:src
Committed By:   christos
Date:   Tue Feb 23 22:14:59 UTC 2021

Modified Files:
src/lib/libc/regex: cname.h engine.c re_format.7 regcomp.c regerror.c
regex.3 regex2.h regexec.c regfree.c utils.h
Removed Files:
src/lib/libc/regex: cclass.h

Log Message:
sync with FreeBSD:
   - NLS support
   - GNU extensions
   - bug fixes


To generate a diff of this commit:
cvs rdiff -u -r1.7 -r0 src/lib/libc/regex/cclass.h
cvs rdiff -u -r1.7 -r1.8 src/lib/libc/regex/cname.h
cvs rdiff -u -r1.24 -r1.25 src/lib/libc/regex/engine.c
cvs rdiff -u -r1.12 -r1.13 src/lib/libc/regex/re_format.7
cvs rdiff -u -r1.38 -r1.39 src/lib/libc/regex/regcomp.c
cvs rdiff -u -r1.23 -r1.24 src/lib/libc/regex/regerror.c
cvs rdiff -u -r1.26 -r1.27 src/lib/libc/regex/regex.3
cvs rdiff -u -r1.13 -r1.14 src/lib/libc/regex/regex2.h
cvs rdiff -u -r1.22 -r1.23 src/lib/libc/regex/regexec.c
cvs rdiff -u -r1.15 -r1.16 src/lib/libc/regex/regfree.c
cvs rdiff -u -r1.6 -r1.7 src/lib/libc/regex/utils.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.




Re: CVS commit: src/lib/libc/gen

2021-02-17 Thread David Holland
On Wed, Feb 17, 2021 at 11:51:04PM +, David A. Holland wrote:
 > Also, Someone(TM) should check if POSIX permits this or if we ought to
 > improve the implementation.

Unsurprisingly, POSIX is silent. It just says "rewinddir shall also
cause the directory stream to refer to the current state of the
corresponding directory, as a call to opendir() would have done,"
basically the same text we had previously.

This could be read as _mandating_ reopening it, but that is almost
certainly not what they intended, so I think we're ok.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/opendir.html
https://pubs.opengroup.org/onlinepubs/9699919799/functions/rewinddir.html

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/lib/libc/locale

2021-02-15 Thread Joerg Sonnenberger
On Mon, Feb 15, 2021 at 04:37:15PM +0100, Thomas Klausner wrote:
> Hi!
> 
> Thanks for the man pages.
> 
> There is none for uselocale(3), which is heavily referenced from
> these, nor do I find a prototype in /usr/include...
> so how does one use these? :)

We don't support uselocale. You use this functions by passing the
locale_t to the *_l functions directly.

Joerg


Re: CVS commit: src/lib/libc/locale

2021-02-15 Thread Thomas Klausner
Hi!

Thanks for the man pages.

There is none for uselocale(3), which is heavily referenced from
these, nor do I find a prototype in /usr/include...
so how does one use these? :)
 Thomas

On Mon, Feb 15, 2021 at 09:35:04AM -0500, Christos Zoulas wrote:
> Module Name:  src
> Committed By: christos
> Date: Mon Feb 15 14:35:04 UTC 2021
> 
> Modified Files:
>   src/lib/libc/locale: Makefile.inc
> Added Files:
>   src/lib/libc/locale: duplocale.3 freelocale.3 newlocale.3
> 
> Log Message:
> Add missing man pages (from FreeBSD)
> 
> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.64 -r1.65 src/lib/libc/locale/Makefile.inc
> cvs rdiff -u -r0 -r1.1 src/lib/libc/locale/duplocale.3 \
> src/lib/libc/locale/freelocale.3 src/lib/libc/locale/newlocale.3
> 
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.
> 

> Modified files:
> 
> Index: src/lib/libc/locale/Makefile.inc
> diff -u src/lib/libc/locale/Makefile.inc:1.64 
> src/lib/libc/locale/Makefile.inc:1.65
> --- src/lib/libc/locale/Makefile.inc:1.64 Sun Aug 18 16:03:48 2013
> +++ src/lib/libc/locale/Makefile.inc  Mon Feb 15 09:35:04 2021
> @@ -1,5 +1,5 @@
>  #from: @(#)Makefile.inc  5.1 (Berkeley) 2/18/91
> -#$NetBSD: Makefile.inc,v 1.64 2013/08/18 20:03:48 joerg Exp $
> +#$NetBSD: Makefile.inc,v 1.65 2021/02/15 14:35:04 christos Exp $
>  
>  # locale sources
>  .PATH: ${ARCHDIR}/locale ${.CURDIR}/locale
> @@ -21,7 +21,7 @@ CPPFLAGS.runetable.c+=  -I${LIBCDIR}/cit
>  CPPFLAGS.multibyte_c90.c+=   -I${LIBCDIR}/citrus
>  CPPFLAGS.multibyte_amd1.c+=  -I${LIBCDIR}/citrus
>  
> -MAN+=setlocale.3 nl_langinfo.3
> +MAN+=duplocale.3 freelocale.3 newlocale.3 setlocale.3 nl_langinfo.3 
>  
>  MAN+=mbtowc.3 mbstowcs.3 wctomb.3 wcstombs.3 mblen.3 \
>  
> 
> Added files:
> 
> Index: src/lib/libc/locale/duplocale.3
> diff -u /dev/null src/lib/libc/locale/duplocale.3:1.1
> --- /dev/null Mon Feb 15 09:35:04 2021
> +++ src/lib/libc/locale/duplocale.3   Mon Feb 15 09:35:04 2021
> @@ -0,0 +1,80 @@
> +.\" $NetBSD: duplocale.3,v 1.1 2021/02/15 14:35:04 christos Exp $
> +.\" Copyright (c) 2011 The FreeBSD Foundation
> +.\" All rights reserved.
> +.\"
> +.\" This documentation was written by David Chisnall under sponsorship from
> +.\" the FreeBSD Foundation.
> +.\"
> +.\" Redistribution and use in source and binary forms, with or without
> +.\" modification, are permitted provided that the following conditions
> +.\" are met:
> +.\" 1. Redistributions of source code must retain the above copyright
> +.\"notice, this list of conditions and the following disclaimer.
> +.\" 2. Redistributions in binary form must reproduce the above copyright
> +.\"notice, this list of conditions and the following disclaimer in the
> +.\"documentation and/or other materials provided with the distribution.
> +.\"
> +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
> +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
> +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 
> PURPOSE
> +.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
> +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 
> CONSEQUENTIAL
> +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
> +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, 
> STRICT
> +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> +.\" SUCH DAMAGE.
> +.\"
> +.\" $FreeBSD: head/lib/libc/locale/duplocale.3 281925 2015-04-24 10:17:55Z 
> theraven $
> +.\"
> +.Dd September 17, 2011
> +.Dt DUPLOCALE 3
> +.Os
> +.Sh NAME
> +.Nm duplocale
> +.Nd duplicate an locale
> +.Sh LIBRARY
> +.Lb libc
> +.Sh SYNOPSIS
> +.In locale.h
> +.Ft locale_t
> +.Fn duplocale "locale_t locale"
> +.Sh DESCRIPTION
> +Duplicates an existing
> +.Fa locale_t
> +returning a new
> +.Fa locale_t
> +that refers to the same locale values but has an independent internal state.
> +Various functions, such as
> +.Xr mblen 3
> +require a persistent state.
> +These functions formerly used static variables and calls to them from 
> multiple
> +threads had undefined behavior.
> +They now use fields in the
> +.Fa locale_t
> +associated with the current thread by
> +.Xr uselocale 3 .
> +These calls are therefore only thread safe on threads with a unique 
> per-thread
> +locale.
> +The locale returned by this call must be freed with
> +.Xr freelocale 3 .
> +.Sh SEE ALSO
> +.Xr freelocale 3 ,
> +.Xr localeconv 3 ,
> +.Xr newlocale 3 ,
> +.\" .Xr querylocale 3 ,
> +.Xr uselocale 3 ,
> +.\" .Xr xlocale 3
> +.Sh STANDARDS
> +This function conforms to
> +.St -p1003.1-2008 .
> +.Sh BUGS
> +Ideally,
> +.Xr uselocale 3
> +should make a 

Re: CVS commit: src/lib/libc/stdio

2021-02-01 Thread Robert Elz
Date:Mon, 1 Feb 2021 17:50:54 +
From:"Jaromir Dolecek" 
Message-ID:  <20210201175054.112e7f...@cvs.netbsd.org>

  | FreeBSD has a similar check, but they return EINVAL instead, feel
  | free to adjust if SUS or other standard mandates specific value

Not currently (unless the C standard says something) - but EINVAL
would be a better choice, fread() incorporates all errors from
fgetc() (and fwrite() from fputc()) and there EOVERFLOW is used to
indicate that the file offset exceeds the maximum possible file
size, which is quite a different problem.   EINVAL is currently
unused (at least in POSIX) but makes more sense for this problem
(which is am invalid arg combination).

FWIW, I submitted a defect report on this issue with the Austin Group
(POSIX maintainers) yesterday, though I am expecting they'll simply
punt it to the C standards group.

kre



Re: CVS commit: src/lib/libc/stdio

2021-01-31 Thread Robert Elz
Date:Sun, 31 Jan 2021 18:34:22 +0100
From:Joerg Sonnenberger 
Message-ID:  

  | That makes no sense. Just turn them into a short read, which is
  | something users have to deal with anyway.

I'm not sure I agree with that one.   If the user's size * nmemb
overflows a size_t then the user is attempting something entirely
insane (they cannot possibly have a buffer that big).  Reading
anything at all is most probably the wrong thing to do, as the
args are very likely simply erroneous.   EINVAL is not an unreasonable
approach, I wouldn't even object to simply abort().

kre



Re: CVS commit: src/lib/libc/stdio

2021-01-31 Thread Kamil Rytarowski
On 31.01.2021 18:34, Joerg Sonnenberger wrote:
> On Sun, Jan 31, 2021 at 05:27:28PM +0100, Kamil Rytarowski wrote:
>> On 31.01.2021 17:18, Jaromir Dolecek wrote:
>>> Module Name:src
>>> Committed By:   jdolecek
>>> Date:   Sun Jan 31 16:18:22 UTC 2021
>>>
>>> Modified Files:
>>> src/lib/libc/stdio: fread.c
>>>
>>> Log Message:
>>> for unbuffered I/O arrange for the destination buffer to be filled in one
>>> go, instead of triggering long series of 1 byte read(2)s; this speeds up
>>> fread() several order of magnitudes for this case, directly proportional
>>> to the size of the supplied buffer
>>>
>>> change adapted from OpenBSD rev. 1.19
>>>
>>> fixes PR lib/55808 by Roland Illig
>>>
>>
>> While there it would be cool to apply FreeBSD and OpenBSD hardening for
>> invalid size x nmemb, checking for overflow. I propose to return EINVAL
>> in such case.
> 
> That makes no sense. Just turn them into a short read, which is
> something users have to deal with anyway.
> 
> Joerg
> 

The purpose of this errno is to catch insage usage of API that in normal
circumstances never makes sense. With overflows (that are fine), the
error can be unnoticed.

Both FreeBSD and OpenBSD found this behavior as useful.


Re: CVS commit: src/lib/libc/stdio

2021-01-31 Thread Joerg Sonnenberger
On Sun, Jan 31, 2021 at 05:27:28PM +0100, Kamil Rytarowski wrote:
> On 31.01.2021 17:18, Jaromir Dolecek wrote:
> > Module Name:src
> > Committed By:   jdolecek
> > Date:   Sun Jan 31 16:18:22 UTC 2021
> > 
> > Modified Files:
> > src/lib/libc/stdio: fread.c
> > 
> > Log Message:
> > for unbuffered I/O arrange for the destination buffer to be filled in one
> > go, instead of triggering long series of 1 byte read(2)s; this speeds up
> > fread() several order of magnitudes for this case, directly proportional
> > to the size of the supplied buffer
> > 
> > change adapted from OpenBSD rev. 1.19
> > 
> > fixes PR lib/55808 by Roland Illig
> > 
> 
> While there it would be cool to apply FreeBSD and OpenBSD hardening for
> invalid size x nmemb, checking for overflow. I propose to return EINVAL
> in such case.

That makes no sense. Just turn them into a short read, which is
something users have to deal with anyway.

Joerg


Re: CVS commit: src/lib/libc/stdio

2021-01-31 Thread Kamil Rytarowski
On 31.01.2021 17:18, Jaromir Dolecek wrote:
> Module Name:  src
> Committed By: jdolecek
> Date: Sun Jan 31 16:18:22 UTC 2021
> 
> Modified Files:
>   src/lib/libc/stdio: fread.c
> 
> Log Message:
> for unbuffered I/O arrange for the destination buffer to be filled in one
> go, instead of triggering long series of 1 byte read(2)s; this speeds up
> fread() several order of magnitudes for this case, directly proportional
> to the size of the supplied buffer
> 
> change adapted from OpenBSD rev. 1.19
> 
> fixes PR lib/55808 by Roland Illig
> 

While there it would be cool to apply FreeBSD and OpenBSD hardening for
invalid size x nmemb, checking for overflow. I propose to return EINVAL
in such case.

I planned to submit it a while ago for discussion.


Re: CVS commit: src/lib/libc/string

2020-04-05 Thread Robert Elz
Date:Sat, 4 Apr 2020 21:16:45 +
From:David Holland 
Message-ID:  <20200404211645.ga19...@netbsd.org>

  | I fail to see any scenario in which it's legitimate for an application
  | to scribble in internal data belonging to libc. Why should this be
  | permitted?

You've never done something like

p = getenv("PATH"); /* NULL check omitted here */
while (q = strchr(p, ':')) {
*q = '\0';
/* use p to do a lookup, printf, or whatever */
*q = ':';
p = q + 1;
}
/* use p here too */

??

If you have, what's the difference, genenv() is also returning data
from libc, it could also be const char *, but isn't.

In neither case is this what would normally be called "internal data"
though, it isn't as if we're directly modifying the data structs malloc()
uses (that would be internal data) - these are the results returned from
libc to the application.

  | I don't understand. It is a bug that the library returns a writeable
  | pointer to data that should not be modified.

But we aren't returning a pointer to data that can't be modified,
libc doesn't care, one way or the other.   Apps shouldn't modify it
(shouldn't attempt to) because of portability concerns with other
implementations.If we did as Joerg suggested and mmapped
the message catalog pages for the relevant locale, and simply returned
a pointer to the relevant entry, one of those could be us.But isn't.

  | Anyway, this is moot because strerror is defined by C;

That was my point.   The entry in the BUGS section of the man page was
merely a rant, as this isn't (wasn't ever) something that can be fixed.

  | but because it *should* be const,
  | it should be (and is) declared with __aconst in string.h.
  |
  | Please don't change that.

Not planning anything like that - this was all about the man page.

  | Also, perhaps we should swipe the text about not modifyingthe string
  | buffer from strsignal.3.

I believe that what is in strerror(3) now is more or less aligned
with the intent (if not the actual language) of that.

Note that strsignal(3) doesn't contain a BUGS section ranting about
how it should really be const char *strsignal().

kre




Re: CVS commit: src/lib/libc/string

2020-04-04 Thread David Holland
On Thu, Mar 26, 2020 at 10:54:21AM +0700, Robert Elz wrote:
 >   | I don't agree -- because applications shouldn't attempt to modify the
 >   | result, it should be const.
 > 
 > The only reason apps shouldn't modify the string is in case of
 > porting the app to an (well, some) ancient implementation.  Because
 > of the NLS requirements, the message these days (any modern
 > implementation) must be read from some external file - which means
 > the storage for it must be writable.  Nothing else (except the
 > calling thread) cares about the content of the returned message, so
 > there's no actual reason for the app to not modify it now if for
 > some wacky reason it wants to.

I fail to see any scenario in which it's legitimate for an application
to scribble in internal data belonging to libc. Why should this be
permitted?

 > The entry that was in BUGS wasn't a bug - what would be a bug would
 > be if we actually made strerror() return const char * - it wasn't
 > even a limitation (which we traditionally include in BUGS, even
 > though they're generally by design, and not accident, and not
 > really intended to be "fixed") - what it was was a rant about the
 > original design spec, and what's more, one which is no longer
 > warranted.

I don't understand. It is a bug that the library returns a writeable
pointer to data that should not be modified.

Anyway, this is moot because strerror is defined by C; but because it
*should* be const, it should be (and is) declared with __aconst in
string.h.

Please don't change that.

Also, perhaps we should swipe the text about not modifyingthe string
buffer from strsignal.3.

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/lib/libc/string

2020-03-26 Thread Joerg Sonnenberger
On Thu, Mar 26, 2020 at 10:54:21AM +0700, Robert Elz wrote:
> Date:Wed, 25 Mar 2020 20:51:25 +
> From:David Holland 
> Message-ID:  <20200325205125.ga11...@netbsd.org>
> 
>   | I don't agree -- because applications shouldn't attempt to modify the
>   | result, it should be const.
> 
> The only reason apps shouldn't modify the string is in case of porting
> the app to an (well, some) ancient implementation.  Because of the NLS
> requirements, the message these days (any modern implementation) must be
> read from some external file - which means the storage for it must be
> writable. 

Actually, the only reason why we really need writable space is the
unknown error case. NLS could in principle be using mmaped data as well,
modulo not being able to unmap it again.

Joerg


Re: CVS commit: src/lib/libc/string

2020-03-25 Thread Robert Elz
Date:Wed, 25 Mar 2020 20:51:25 +
From:David Holland 
Message-ID:  <20200325205125.ga11...@netbsd.org>

  | I don't agree -- because applications shouldn't attempt to modify the
  | result, it should be const.

The only reason apps shouldn't modify the string is in case of porting
the app to an (well, some) ancient implementation.  Because of the NLS
requirements, the message these days (any modern implementation) must be
read from some external file - which means the storage for it must be
writable.   Nothing else (except the calling thread) cares about the content
of the returned message, so there's no actual reason for the app to not
modify it now if for some wacky reason it wants to.

Note that when strerror_l() was created, the opportunity to make that
return const char * existed - const existed in C by then, which it didn't
when strerror() was invented - but wasn't taken, as it was clear by its
very design that strerror_l had to return a pointer to writable memory,
so it was made char * and not const char *.

  | What it points to internally doesn't matter...

True, if the interface had been to return const char *.

The entry that was in BUGS wasn't a bug - what would be a bug would be
if we actually made strerror() return const char * - it wasn't even a 
limitation (which we traditionally include in BUGS, even though they're
generally by design, and not accident, and not really intended to be "fixed")
- what it was was a rant about the original design spec, and what's more,
one which is no longer warranted.

kre

ps: I've never seen, and cannot really imagine, an app that would ever want
to modify the result from strerror(), so none of this really matters anyway.



Re: CVS commit: src/lib/libc/string

2020-03-25 Thread David Holland
On Wed, Mar 25, 2020 at 06:50:47PM +, Robert Elz wrote:
 > Modified Files:
 >  src/lib/libc/string: strerror.3
 > 
 > Log Message:
 > Delete the BUGS paragraph about the "missing" const qualifier for the
 > result type of strerror() (and strerror_l()).   While that once should
 > really have been present, when strerror() was invented, there was no
 > "const" qualifier in C to apply, and now the way the code is writtem
 > (really needs to be because of NLS support) the const is no longer
 > really appropriate.
 > 
 > Applications still shouldn't attempt to modify the result however.

I don't agree -- because applications shouldn't attempt to modify the
result, it should be const.

What it points to internally doesn't matter...

-- 
David A. Holland
dholl...@netbsd.org


Switch to compiler_rt from assembler codes? (Re: CVS commit: src/lib/libc/compiler_rt)

2020-03-10 Thread Rin Okuyama

(added port-sun2@)

On 2020/03/09 3:33, Joerg Sonnenberger wrote:

On Sun, Mar 08, 2020 at 06:30:06AM +, Rin Okuyama wrote:

Module Name:src
Committed By:   rin
Date:   Sun Mar  8 06:30:06 UTC 2020

Modified Files:
src/lib/libc/compiler_rt: Makefile.inc

Log Message:
Fix broken printf(3) %d output for numbers more than two digits, e.g.,

   printf("%d\n", 42) ---> "::" instead of "42"

Our __{,u}modsi3 codes assume that __udivsi3 returns remainder to
%d1 (volatile register). __udivsi3 in libgcc does not, and therefore
mixing them up results in mess.


This looks like the wrong fix to me. This is not about libgcc at all.
The depency in the src/common code should be fixed IMO. If __{,u}modsi3
wants to make assumptions about the remainder being implicitly computed,
it really should be using the divmod primitive instead.


Strictly speaking, this is bug in __{,u}modsi3. However:

(1) This is for optimization. There is no need for stack push and pop if
they can use our special version of __udivsi3, instead of __divmodsi4.
If they were ``fixed'' correctly, the resulted codes would have no big
difference from the C version provided by compiler_rt.

(2) Problems occur only when they are mixed with libgcc version of
__udivsi3. For standalone programs, i.e., kernel and bootloader, this
``wrong'' version works well.

(3) They are used only for m68000 (68010; sun2). For m68k (68020 and
later; other m68k ports), GCC does not generate references to these
functions at all.

Therefore, I propose:

(A) Continue to use these assembler codes only in standalone programs
(kernel and bootloader). For userland (libc), switch to the C version
provided by compiler_rt. Again, this affects only for 68010, i.e., sun2,
and not for other m68k ports.

Alternatively, we can:

(b) Switch both standalone and userland to compiler_rt.

I've checked both (A) and (B) version of live-image work fine on TME.
I prefer (A) over (B), since there is no problem if they are used only
in a controlled situation. Also, we are suffering from increase in
kernel size on sun2. But how do you everyone think?

Thanks,
rin


Re: CVS commit: src/lib/libc/compiler_rt

2020-03-08 Thread Joerg Sonnenberger
On Sun, Mar 08, 2020 at 06:30:06AM +, Rin Okuyama wrote:
> Module Name:  src
> Committed By: rin
> Date: Sun Mar  8 06:30:06 UTC 2020
> 
> Modified Files:
>   src/lib/libc/compiler_rt: Makefile.inc
> 
> Log Message:
> Fix broken printf(3) %d output for numbers more than two digits, e.g.,
> 
>   printf("%d\n", 42) ---> "::" instead of "42"
> 
> Our __{,u}modsi3 codes assume that __udivsi3 returns remainder to
> %d1 (volatile register). __udivsi3 in libgcc does not, and therefore
> mixing them up results in mess.

This looks like the wrong fix to me. This is not about libgcc at all.
The depency in the src/common code should be fixed IMO. If __{,u}modsi3
wants to make assumptions about the remainder being implicitly computed,
it really should be using the divmod primitive instead.

Joerg


Re: CVS commit: src/lib/libc/compiler_rt

2020-03-07 Thread Rin Okuyama

Oops, forgot to note that this is only for 68010, and other
systems including m68k are not affected.

Thanks,
rin

On 2020/03/08 15:30, Rin Okuyama wrote:

Module Name:src
Committed By:   rin
Date:   Sun Mar  8 06:30:06 UTC 2020

Modified Files:
src/lib/libc/compiler_rt: Makefile.inc

Log Message:
Fix broken printf(3) %d output for numbers more than two digits, e.g.,

   printf("%d\n", 42) ---> "::" instead of "42"

Our __{,u}modsi3 codes assume that __udivsi3 returns remainder to
%d1 (volatile register). __udivsi3 in libgcc does not, and therefore
mixing them up results in mess.


To generate a diff of this commit:
cvs rdiff -u -r1.36 -r1.37 src/lib/libc/compiler_rt/Makefile.inc

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.


Modified files:

Index: src/lib/libc/compiler_rt/Makefile.inc
diff -u src/lib/libc/compiler_rt/Makefile.inc:1.36 
src/lib/libc/compiler_rt/Makefile.inc:1.37
--- src/lib/libc/compiler_rt/Makefile.inc:1.36  Tue Oct 29 16:08:50 2019
+++ src/lib/libc/compiler_rt/Makefile.inc   Sun Mar  8 06:30:06 2020
@@ -1,4 +1,4 @@
-# $NetBSD: Makefile.inc,v 1.36 2019/10/29 16:08:50 joerg Exp $
+# $NetBSD: Makefile.inc,v 1.37 2020/03/08 06:30:06 rin Exp $
  
  COMPILER_RT_SRCDIR=	${NETBSDSRCDIR}/sys/external/bsd/compiler_rt/dist
  
@@ -170,9 +170,16 @@ GENERIC_SRCS+= \

  GENERIC_SRCS+= \
divmodsi4.c \
divsi3.c \
-   modsi3.c \
udivmodsi4.c \
+
+. if ${LIBC_MACHINE_ARCH} != "m68000"
+# Our __{,u}modsi3 codes assume that __udivsi3 returns remainder to
+# %d1 (volatile register). __udivsi3 in libgcc does not, and therefore
+# mixing them up results in mess.
+GENERIC_SRCS+= \
+   modsi3.c \
umodsi3.c
+. endif
  
  . if ${LIBC_MACHINE_CPU} != "sh3"

  # On sh3 __udivsi3 is gcc "millicode" with special calling convention



Re: CVS commit: src/lib/libc/stdlib

2020-02-23 Thread Valery Ushakov
On Sun, Feb 23, 2020 at 10:57:28 +0100, Kamil Rytarowski wrote:

> On 23.02.2020 08:46, Martin Husemann wrote:
> 
> > Source code consistency is a very important stylistic plus, every break of
> > that should be accompanied by a comment.
>
> Done.

Thank you.

-uwe


Re: CVS commit: src/lib/libc/stdlib

2020-02-23 Thread Kamil Rytarowski
On 23.02.2020 08:46, Martin Husemann wrote:
> On Sun, Feb 23, 2020 at 03:35:19AM +0100, Kamil Rytarowski wrote:
>> Algorithm would be changed from calculating on 32bit numbers with signed
>> integer overflows to an algorithm calculating on 64bit numbers. The
>> __dorand48() function truncates the result to least significant 16bits
>> only so it does not matter. I retained operations on 32bits avoiding
>> changes of types for stylistic reasons.
> 
> I am with uwe here - either it would not make any difference at all (on
> 32bit architectures) or it would end up with the same results and would
> make no performance difference (on 64 bit architectures), so going with
> the consistent (unsigned long) would have been fine.
> 
> Even better would be a cleanup to make it (uint32_t) everwhere, but of
> course only after carefull examination.
> 
> Source code consistency is a very important stylistic plus, every break of
> that should be accompanied by a comment.
> 
> Martin
> 

Done.



signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/lib/libc/stdlib

2020-02-22 Thread Martin Husemann
On Sun, Feb 23, 2020 at 03:35:19AM +0100, Kamil Rytarowski wrote:
> Algorithm would be changed from calculating on 32bit numbers with signed
> integer overflows to an algorithm calculating on 64bit numbers. The
> __dorand48() function truncates the result to least significant 16bits
> only so it does not matter. I retained operations on 32bits avoiding
> changes of types for stylistic reasons.

I am with uwe here - either it would not make any difference at all (on
32bit architectures) or it would end up with the same results and would
make no performance difference (on 64 bit architectures), so going with
the consistent (unsigned long) would have been fine.

Even better would be a cleanup to make it (uint32_t) everwhere, but of
course only after carefull examination.

Source code consistency is a very important stylistic plus, every break of
that should be accompanied by a comment.

Martin


Re: CVS commit: src/lib/libc/stdlib

2020-02-22 Thread Valery Ushakov
On Sun, Feb 23, 2020 at 03:35:19 +0100, Kamil Rytarowski wrote:

> On 23.02.2020 03:20, Valery Ushakov wrote:
> > On Sun, Feb 23, 2020 at 02:51:49 +0100, Kamil Rytarowski wrote:
> > 
> >> On 23.02.2020 02:29, Valery Ushakov wrote:
> >>> On Sat, Feb 22, 2020 at 14:07:57 +, Kamil Rytarowski wrote:
> >>>
>  Module Name: src
>  Committed By:kamil
>  Date:Sat Feb 22 14:07:57 UTC 2020
> 
>  Modified Files:
>   src/lib/libc/stdlib: _rand48.c
> 
>  Log Message:
>  Avoid undefined behavior in the rand48(3) implementation
> 
>  Instead of implicid promotion to signed int,
>  explicitly cast the arguments to unsigned int.
> >>>
> >>> Please, please, please, pay at least some attention to what is going
> >>> on around the code you are changing.
> >>>
> >>> If there's already code in this function that does:
> >>>
> >>>accu = (unsigned long) __rand48_mult[0] * (unsigned long) xseed[0];
> >>>
> >>> then keep it consistent and don't do casts to a different type
> >>>
> >>>accu += (unsigned int) __rand48_mult[0] * (unsigned int) xseed[2];
> >>
> >> cast to unsigned long still works, but changes algorithm. My change was
> >> performed deliberately. On the other hand and according to local tests
> >> the end-result for unsigned long produces the same reults as cast to
> >> unsigned int and unsigned char so it does not matter.
> > 
> > I cannot make sense of your answer.  Does the cast to unsigned long
> > there change the algorithm or does it produce the same result?  If it
> > produces the same result, then it should be used to be consistent with
> > the rest of the code (or the rest of the code changed to use unsigned
> > int).  If it does change the result, there should be a comment
> > explaining it.
> 
> Algorithm would be changed from calculating on 32bit numbers with signed
> integer overflows to an algorithm calculating on 64bit numbers. The
> __dorand48() function truncates the result to least significant 16bits
> only so it does not matter. I retained operations on 32bits avoiding
> changes of types for stylistic reasons.

That still doesn't make sense to me.  You took your time to figure out
whats going on in this bit of code.  Then you make a change that looks
extremely unobvious b/c it is inconsistent with the rest of the code,
and you say you did this for stylistic reasons.

The next person looking at that code (in $bignum years) will have to
waste their time puzzling out the reason.  Why not use the knowledge
you've gained of this code for good and change the code properly?  The
90s unsigned long was probably meant to be 32-bit anyway (cf. X11 mess
up with using long for 32 bit quantities in the protocol and then
running into issues when 64-bit happened).  So if doing things in
32-bit here is the right and intended thing to do, then change that
unsigned long to uint32_t.  If you don't want to dive that deep (which
is entirely understandable), then, exactly for stylistic reasons, cast
to unsigned long to be consistent with the old code - as you already
established that it didn't change the result.

-uwe


Re: CVS commit: src/lib/libc/stdlib

2020-02-22 Thread Kamil Rytarowski
On 23.02.2020 03:20, Valery Ushakov wrote:
> On Sun, Feb 23, 2020 at 02:51:49 +0100, Kamil Rytarowski wrote:
> 
>> On 23.02.2020 02:29, Valery Ushakov wrote:
>>> On Sat, Feb 22, 2020 at 14:07:57 +, Kamil Rytarowski wrote:
>>>
 Module Name:   src
 Committed By:  kamil
 Date:  Sat Feb 22 14:07:57 UTC 2020

 Modified Files:
src/lib/libc/stdlib: _rand48.c

 Log Message:
 Avoid undefined behavior in the rand48(3) implementation

 Instead of implicid promotion to signed int,
 explicitly cast the arguments to unsigned int.
>>>
>>> Please, please, please, pay at least some attention to what is going
>>> on around the code you are changing.
>>>
>>> If there's already code in this function that does:
>>>
>>>accu = (unsigned long) __rand48_mult[0] * (unsigned long) xseed[0];
>>>
>>> then keep it consistent and don't do casts to a different type
>>>
>>>accu += (unsigned int) __rand48_mult[0] * (unsigned int) xseed[2];
>>
>> cast to unsigned long still works, but changes algorithm. My change was
>> performed deliberately. On the other hand and according to local tests
>> the end-result for unsigned long produces the same reults as cast to
>> unsigned int and unsigned char so it does not matter.
> 
> I cannot make sense of your answer.  Does the cast to unsigned long
> there change the algorithm or does it produce the same result?  If it
> produces the same result, then it should be used to be consistent with
> the rest of the code (or the rest of the code changed to use unsigned
> int).  If it does change the result, there should be a comment
> explaining it.
> 
> -uwe
> 

Algorithm would be changed from calculating on 32bit numbers with signed
integer overflows to an algorithm calculating on 64bit numbers. The
__dorand48() function truncates the result to least significant 16bits
only so it does not matter. I retained operations on 32bits avoiding
changes of types for stylistic reasons.



signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/lib/libc/stdlib

2020-02-22 Thread Valery Ushakov
On Sun, Feb 23, 2020 at 02:51:49 +0100, Kamil Rytarowski wrote:

> On 23.02.2020 02:29, Valery Ushakov wrote:
> > On Sat, Feb 22, 2020 at 14:07:57 +, Kamil Rytarowski wrote:
> > 
> >> Module Name:   src
> >> Committed By:  kamil
> >> Date:  Sat Feb 22 14:07:57 UTC 2020
> >>
> >> Modified Files:
> >>src/lib/libc/stdlib: _rand48.c
> >>
> >> Log Message:
> >> Avoid undefined behavior in the rand48(3) implementation
> >>
> >> Instead of implicid promotion to signed int,
> >> explicitly cast the arguments to unsigned int.
> > 
> > Please, please, please, pay at least some attention to what is going
> > on around the code you are changing.
> > 
> > If there's already code in this function that does:
> > 
> >accu = (unsigned long) __rand48_mult[0] * (unsigned long) xseed[0];
> > 
> > then keep it consistent and don't do casts to a different type
> > 
> >accu += (unsigned int) __rand48_mult[0] * (unsigned int) xseed[2];
> 
> cast to unsigned long still works, but changes algorithm. My change was
> performed deliberately. On the other hand and according to local tests
> the end-result for unsigned long produces the same reults as cast to
> unsigned int and unsigned char so it does not matter.

I cannot make sense of your answer.  Does the cast to unsigned long
there change the algorithm or does it produce the same result?  If it
produces the same result, then it should be used to be consistent with
the rest of the code (or the rest of the code changed to use unsigned
int).  If it does change the result, there should be a comment
explaining it.

-uwe


Re: CVS commit: src/lib/libc/stdlib

2020-02-22 Thread Kamil Rytarowski
On 23.02.2020 02:29, Valery Ushakov wrote:
> On Sat, Feb 22, 2020 at 14:07:57 +, Kamil Rytarowski wrote:
> 
>> Module Name: src
>> Committed By:kamil
>> Date:Sat Feb 22 14:07:57 UTC 2020
>>
>> Modified Files:
>>  src/lib/libc/stdlib: _rand48.c
>>
>> Log Message:
>> Avoid undefined behavior in the rand48(3) implementation
>>
>> Instead of implicid promotion to signed int,
>> explicitly cast the arguments to unsigned int.
> 
> Please, please, please, pay at least some attention to what is going
> on around the code you are changing.
> 
> If there's already code in this function that does:
> 
>accu = (unsigned long) __rand48_mult[0] * (unsigned long) xseed[0];
> 
> then keep it consistent and don't do casts to a different type
> 
>accu += (unsigned int) __rand48_mult[0] * (unsigned int) xseed[2];
> 
> 
> -uwe
> 

cast to unsigned long still works, but changes algorithm. My change was
performed deliberately. On the other hand and according to local tests
the end-result for unsigned long produces the same reults as cast to
unsigned int and unsigned char so it does not matter.



signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/lib/libc/stdlib

2020-02-22 Thread Valery Ushakov
On Sat, Feb 22, 2020 at 14:07:57 +, Kamil Rytarowski wrote:

> Module Name:  src
> Committed By: kamil
> Date: Sat Feb 22 14:07:57 UTC 2020
> 
> Modified Files:
>   src/lib/libc/stdlib: _rand48.c
> 
> Log Message:
> Avoid undefined behavior in the rand48(3) implementation
> 
> Instead of implicid promotion to signed int,
> explicitly cast the arguments to unsigned int.

Please, please, please, pay at least some attention to what is going
on around the code you are changing.

If there's already code in this function that does:

   accu = (unsigned long) __rand48_mult[0] * (unsigned long) xseed[0];

then keep it consistent and don't do casts to a different type

   accu += (unsigned int) __rand48_mult[0] * (unsigned int) xseed[2];


-uwe


Re: CVS commit: src/lib/libc/gen

2020-02-03 Thread Kamil Rytarowski
On 03.02.2020 17:24, Kamil Rytarowski wrote:
> On 03.02.2020 17:21, Joerg Sonnenberger wrote:
>> On Sat, Feb 01, 2020 at 03:38:46PM +, Kamil Rytarowski wrote:
>>> Module Name:src
>>> Committed By:   kamil
>>> Date:   Sat Feb  1 15:38:46 UTC 2020
>>>
>>> Modified Files:
>>> src/lib/libc/gen: pthread_atfork.c
>>>
>>> Log Message:
>>> Switch atform allocations from malloc()+free() to mmap()+munmap()
>>>
>>> This avoid bootstrapping malloc too early when libc+libpthread are not
>>> ready. It is called through pthread__init() -> _pthread_atfork().
>>>
>>> This also helps LLVM Leak Sanitizer to pacify false positive reports.
>>
>> Can we please stop adding more and more hacks for the still questionable
>> "new" jemalloc and sit down for a sane init model first please? I'm
>> still at the point to ask to just revert to the known-to-be-working
>> version and therefore would also strongly prefer this change to be
>> reverted.
>>
>> Joerg
>>
> 
> This change was reverted as there was a fallout. Init model is sane
> except for pthread_atfork()+malloc() called prematurely in pthread__init().
> 

OK. The pthread_atfork(3) call was introduced with the following change.

Maybe we should fix Python?

commit 25c9c068b18fceac33cc6fec61a1277b5dc6107c
Author: joerg 
Date:   Wed Dec 25 00:44:45 2019 +

Since pthread_setspecific requires locks, ensure that they are acquired
before fork and dropped in both parent and child. At least Python
depends on TSD after fork, even though it is undefined behavior in
POSIX.

diff --git a/lib/libpthread/pthread_tsd.c b/lib/libpthread/pthread_tsd.c
index 666cbab85f1b..d6e261de0bb9 100644
--- a/lib/libpthread/pthread_tsd.c
+++ b/lib/libpthread/pthread_tsd.c
@@ -1,4 +1,4 @@
-/* $NetBSD: pthread_tsd.c,v 1.17 2019/03/05 01:35:52 christos Exp $
*/
+/* $NetBSD: pthread_tsd.c,v 1.18 2019/12/25 00:44:45 joerg Exp $   */

 /*-
  * Copyright (c) 2001, 2007 The NetBSD Foundation, Inc.
@@ -30,7 +30,7 @@
  */

 #include 
-__RCSID("$NetBSD: pthread_tsd.c,v 1.17 2019/03/05 01:35:52 christos Exp
$");
+__RCSID("$NetBSD: pthread_tsd.c,v 1.18 2019/12/25 00:44:45 joerg Exp $");

 /* Functions and structures dealing with thread-specific data */
 #include 
@@ -61,6 +61,18 @@ null_destructor(void *p)
 #include 
 #include 

+static void
+pthread_tsd_prefork(void)
+{
+   pthread_mutex_lock(_mutex);
+}
+
+static void
+pthread_tsd_postfork(void)
+{
+   pthread_mutex_unlock(_mutex);
+}
+
 void *
 pthread_tsd_init(size_t *tlen)
 {
@@ -68,6 +80,8 @@ pthread_tsd_init(size_t *tlen)
size_t alen;
char *arena;

+   pthread_atfork(pthread_tsd_prefork, pthread_tsd_postfork,
pthread_tsd_postfork);
+
if ((pkm = pthread__getenv("PTHREAD_KEYS_MAX")) != NULL) {
pthread_keys_max = (int)strtol(pkm, NULL, 0);
if (pthread_keys_max < _POSIX_THREAD_KEYS_MAX)







signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/lib/libc/gen

2020-02-03 Thread Kamil Rytarowski
On 03.02.2020 17:21, Joerg Sonnenberger wrote:
> On Sat, Feb 01, 2020 at 03:38:46PM +, Kamil Rytarowski wrote:
>> Module Name: src
>> Committed By:kamil
>> Date:Sat Feb  1 15:38:46 UTC 2020
>>
>> Modified Files:
>>  src/lib/libc/gen: pthread_atfork.c
>>
>> Log Message:
>> Switch atform allocations from malloc()+free() to mmap()+munmap()
>>
>> This avoid bootstrapping malloc too early when libc+libpthread are not
>> ready. It is called through pthread__init() -> _pthread_atfork().
>>
>> This also helps LLVM Leak Sanitizer to pacify false positive reports.
> 
> Can we please stop adding more and more hacks for the still questionable
> "new" jemalloc and sit down for a sane init model first please? I'm
> still at the point to ask to just revert to the known-to-be-working
> version and therefore would also strongly prefer this change to be
> reverted.
> 
> Joerg
> 

This change was reverted as there was a fallout. Init model is sane
except for pthread_atfork()+malloc() called prematurely in pthread__init().



signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/lib/libc/gen

2020-02-03 Thread Joerg Sonnenberger
On Sat, Feb 01, 2020 at 03:38:46PM +, Kamil Rytarowski wrote:
> Module Name:  src
> Committed By: kamil
> Date: Sat Feb  1 15:38:46 UTC 2020
> 
> Modified Files:
>   src/lib/libc/gen: pthread_atfork.c
> 
> Log Message:
> Switch atform allocations from malloc()+free() to mmap()+munmap()
> 
> This avoid bootstrapping malloc too early when libc+libpthread are not
> ready. It is called through pthread__init() -> _pthread_atfork().
> 
> This also helps LLVM Leak Sanitizer to pacify false positive reports.

Can we please stop adding more and more hacks for the still questionable
"new" jemalloc and sit down for a sane init model first please? I'm
still at the point to ask to just revert to the known-to-be-working
version and therefore would also strongly prefer this change to be
reverted.

Joerg


Re: CVS commit: src/lib/libc/sys

2020-01-06 Thread Christos Zoulas
In article <20200104044017.c44aef...@cvs.netbsd.org>,
Kamil Rytarowski  wrote:
>-=-=-=-=-=-
>
>Module Name:   src
>Committed By:  kamil
>Date:  Sat Jan  4 04:40:17 UTC 2020
>
>Modified Files:
>   src/lib/libc/sys: ptrace.2
>
>Log Message:
>/tmp/cvsbigmGa

You need to read the commit message in, not pass the name of the file!

christos



Re: CVS commit: src/lib/libc/sys

2020-01-04 Thread Kamil Rytarowski
On 04.01.2020 05:40, Kamil Rytarowski wrote:
> Module Name:  src
> Committed By: kamil
> Date: Sat Jan  4 04:40:17 UTC 2020
> 
> Modified Files:
>   src/lib/libc/sys: ptrace.2
> 
> Log Message:
> /tmp/cvsbigmGa
> 


Document PT_LWPSTATUS and PT_LWPNEXT in ptrace(2)

Remove mentions of obsolete PT_LWPINFO.


> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.82 -r1.83 src/lib/libc/sys/ptrace.2
> 
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.
> 
> 
> Modified files:
> 
> Index: src/lib/libc/sys/ptrace.2
> diff -u src/lib/libc/sys/ptrace.2:1.82 src/lib/libc/sys/ptrace.2:1.83
> --- src/lib/libc/sys/ptrace.2:1.82Wed Oct  9 14:20:47 2019
> +++ src/lib/libc/sys/ptrace.2 Sat Jan  4 04:40:17 2020
> @@ -1,7 +1,7 @@
> -.\"  $NetBSD: ptrace.2,v 1.82 2019/10/09 14:20:47 wiz Exp $
> +.\"  $NetBSD: ptrace.2,v 1.83 2020/01/04 04:40:17 kamil Exp $
>  .\"
>  .\" This file is in the public domain.
> -.Dd October 9, 2019
> +.Dd January 4, 2019
>  .Dt PTRACE 2
>  .Os
>  .Sh NAME
> @@ -399,7 +399,7 @@ argument should contain the name of the 
>  and the
>  .Fa data
>  argument should contain the length of the core filename.
> -.It Dv PT_LWPINFO
> +.It Dv PT_LWPSTATUS
>  Returns information about a thread from the list of threads for the
>  process specified in the
>  .Fa pid
> @@ -407,41 +407,50 @@ argument.
>  The
>  .Fa addr
>  argument should contain a
> -.Vt struct ptrace_lwpinfo
> +.Vt struct ptrace_lwpstatus
>  defined as:
>  .Bd -literal -offset indent
> -struct ptrace_lwpinfo {
> +struct ptrace_lwpstatus {
>   lwpid_t pl_lwpid;
> - int pl_event;
> + sigset_t pl_sigpend;
> + sigset_t pl_sigmask;
> + char pl_name[20];
> + void *pl_private;
>  };
>  .Ed
>  .Pp
>  where
>  .Fa pl_lwpid
>  contains a thread LWP ID.
> -Information is returned for the thread following the one with the
> +Information is returned for the thread specified in
> +.Fa pl_lwpid .
> +.Fa pl_sigpend
> +contains the signals pending on that LWP.
> +.Fa pl_sigmask
> +contains the signals masked on that LWP.
> +.Fa pl_name
> +contains printable name of the LWP.
> +The string is always NUL terminated.
> +.Fa pl_private
> +contains the pointer to TLS base.
> +.Pp
> +The
> +.Fa data
> +argument should contain
> +.Dq Li sizeof(struct ptrace_lwpinfo) .
> +.It Dv PT_LWPNEXT
> +Is the same as
> +.Dv PT_LWPSTATUS ,
> +except that information is returned for the thread following the one with the
>  specified ID in the process thread list, or for the first thread
>  if
>  .Fa pl_lwpid
>  is 0.
> +.Pp
>  Upon return
>  .Fa pl_lwpid
>  contains the LWP ID of the thread that was found, or 0 if there is
>  no thread after the one whose LWP ID was supplied in the call.
> -.Fa pl_event
> -contains the event that stopped the thread.
> -Possible values are:
> -.Pp
> -.Bl -tag -width 30n -offset indent -compact
> -.It Dv PL_EVENT_NONE
> -.It Dv PL_EVENT_SIGNAL
> -.It Dv PL_EVENT_SUSPENDED
> -.El
> -.Pp
> -The
> -.Fa data
> -argument should contain
> -.Dq Li sizeof(struct ptrace_lwpinfo) .
>  .It Dv PT_SYSCALL
>  Stops a process before and after executing each system call.
>  Otherwise this operation is the same as
> @@ -987,10 +996,3 @@ to
>  .Fn ptrace
>  .Ec ,
>  should be able to sidestep this.
> -.Pp
> -.Dv PT_SET_SIGINFO ,
> -.Dv PT_RESUME
> -and
> -.Dv PT_SUSPEND
> -can change the image of process returned by
> -.Dv PT_LWPINFO .
> 




signature.asc
Description: OpenPGP digital signature


Re: CVS commit: src/lib/libc/tls

2019-11-22 Thread Takeshi Nakayama
>>> Joerg Sonnenberger  wrote

> 
> On Fri, Nov 22, 2019 at 08:53:25AM +0900, Takeshi Nakayama wrote:
> > >>> Takeshi Nakayama  wrote
> > 
> > > >>> Joerg Sonnenberger  wrote
> > > 
> > > > On Thu, Nov 21, 2019 at 11:06:16PM +, Takeshi Nakayama wrote:
> > > > > Module Name:  src
> > > > > Committed By: nakayama
> > > > > Date: Thu Nov 21 23:06:16 UTC 2019
> > > > > 
> > > > > Modified Files:
> > > > >   src/lib/libc/tls: tls.c
> > > > > 
> > > > > Log Message:
> > > > > Fix PR/54074 and PR/54093 completely.
> > > > > 
> > > > > More similar to the ld.elf_so logic, it is necessary to align with
> > > > > p_align first.  Also, invert the #ifdef condition for consistency.
> > > > 
> > > > This commit just wastes space without reason.
> > > 
> > > No, without this commit the TLS offset calculation is mismatch if
> > > alignof(max_align_t)) != p_align.
> > 
> > Ah, this is wrong.
> > Correctly, the TLS offset calculation is mismatch with what ld(1)
> > expected if p_memsz is not aligned with p_align.
> 
> For TLS variant I, it literally just adds padding at the end of the
> allocation. For TLS variant II, it is redundant, because the rounding is
> already done in with max_align_t. We do not support p_align > malloc
> alignment and the patch is certainly nowhere near enough to change that.
> It is actively harmful in that it can make people believe that it could
> ever work.

I don't want to do anything about the case of p_align > malloc
alignment.  I just want to fix an usual 32-bit sparc statically
linked binaries not working properly.

The new jemalloc now uses TLS, and the 32-bit sparc statically
linked binary ELF header looks like this:

% readelf -l rescue/sh | egrep 'MemSiz|TLS' 
  Type   Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  TLS0x6dc000 0x0075c000 0x0075c000 0x00b10 0x00b14 R   0x8

The point is when MemSiz (p_memsz) is not aligned with Align
(p_align) as above.

In tls.c rev 1.13:
  tls_size = p_memsz --> 0x00b14
  tls_allocation = roundup2(tls_size, alignof(max_align_t)) --> 0x00b18
  tcb = calloc() + tls_allocation --> calloc() + 0x00b18
  p = tcb - tls_size --> calloc() + 4
  memcpy(p, tls_initaddr, tls_initsize)

So, the TLS initialization image is copied to calloc() + 4.
However, ld(1) assumes that the TLS area starts with:
  tcb - roundup2(p_memsz, p_align) = tcb - 0x00b18 --> calloc()

Since this is 4 bytes different, the initial value of the thread
local variable is not set correctly.

To fix this problem, I added the same thing as this line in ld.elf_so:

  https://nxr.netbsd.org/xref/src/libexec/ld.elf_so/tls.c#236

* obj->tlssize = p_memsz, obj->tlsalign = p_align

-- Takeshi Nakayama


  1   2   3   4   5   >