Re: KSYMS_CLOSEST

2022-12-25 Thread Anders Magnusson

Den 2022-12-25 kl. 17:25, skrev Valery Ushakov:

On Sun, Dec 25, 2022 at 15:42:47 +0100, Anders Magnusson wrote:


Den 2022-12-25 kl. 13:43, skrev Valery Ushakov:

On Sun, Dec 25, 2022 at 09:20:49 +0100, Anders Magnusson wrote:


IIRC it was to match the ddb "sift" command.

I'm not sure I get how it might be used for sifting - a kind of "next"
for external iteration?  Since we never got around to doing that, do we
still want to keep it, or shall we deprecate/delete it?

Ah! I had to look at the code - no, it has nothing to do with sift.
I think it is implicit when asking for a name these days; it is used
to get the nearest lower address in debug output (like tstile+0x18).

Right, right, but I wonder what could it possibly mean then, when the
flag is not specified - as opposed to the example above.  I.e. if
KSYMS_CLOSEST is foo+0x10, what KSYMS_EXTERN (i.e. no specific flags)
could be, other than foo+0x10, for the same address?  I mean,
technically, netbsd + 0xcaffe42 would also be a correct reply in that
case :)
:-)  If you are not specifying KSYMS_EXACT, you may not get the exact 
address, yes.  That is true :-)



Also, checking the very first versions of ksyms code I don't see
KSYMS_CLOSEST ever actually handled (it's defined and specified in the
ddb strategy defines, but never tested in ksyms).  May be I missed
some later short-lived incarnation.

The existing call sites that supply the flag look like cargo-cult^W^W
common sense ("looks like you might need to specify that flag to get
foo+0x10, well, *shrug*, won't hurt").

I assume that might be the case, yes.
The ksyms code comes from another system, for which I wrote it a long
time ago, where the flag may have had some significance (I do not remember).
But feel free to clean this up.  (IMHO KSYMS_EXACT should be the 
default, requiring KSYMS_CLOSEST to be defined if that is requested).
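
A minimal sketch of the two lookup semantics being discussed, in case it
helps the cleanup; this is illustrative only, not the kern_ksyms.c code,
and the table and helper names are made up:

#include <stddef.h>

struct sym { const char *name; unsigned long value; };

/*
 * "Exact" semantics: fail unless addr is exactly a symbol value.
 * "Closest" semantics: return the nearest symbol at or below addr plus
 * the offset, which is what ddb wants for output like "tstile+0x18".
 */
static const struct sym *
symlookup(const struct sym *tab, size_t n, unsigned long addr,
    int exact, unsigned long *offp)
{
	const struct sym *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (tab[i].value == addr) {
			*offp = 0;
			return &tab[i];
		}
		if (!exact && tab[i].value < addr &&
		    (best == NULL || tab[i].value > best->value))
			best = &tab[i];
	}
	if (best != NULL)
		*offp = addr - best->value;
	return best;
}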


-- Ragge


Re: KSYMS_CLOSEST

2022-12-25 Thread Anders Magnusson

Den 2022-12-25 kl. 13:43, skrev Valery Ushakov:

On Sun, Dec 25, 2022 at 09:20:49 +0100, Anders Magnusson wrote:


IIRC it was to match the ddb "sift" command.

I'm not sure I get how it might be used for sifting - a kind of "next"
for external iteration?  Since we never got around to doing that, do we
still want to keep it, or shall we deprecate/delete it?

Ah! I had to look at the code - no, it has nothing to do with sift.
I think it is implicit when asking for a name these days; it is used to
get the nearest lower address in debug output (like tstile+0x18).


Yes, I think it can be removed, but the DDB code must be cleaned up as
well, since it seems to use it in a bunch of places.


-- R






Den 2022-12-25 kl. 01:01, skrev Valery Ushakov:

KSYMS_CLOSEST flag is documented as "Nearest lower match".  However as
far as I can tell nothing in ksyms code ever pays attention to this
flag and it's not clear to me what meaning one can ascribe to the set
of flags that doesn't have KSYMS_CLOSEST set.

Ragge, do you remember what you had in mind for it when you
introduced it back in 2003?

I think we should g/c it.

-uwe




Re: KSYMS_CLOSEST

2022-12-25 Thread Anders Magnusson

IIRC it was to match the ddb "sift" command.

Den 2022-12-25 kl. 01:01, skrev Valery Ushakov:

KSYMS_CLOSEST flag is documented as "Nearest lower match".  However as
far as I can tell nothing in ksyms code ever pays attention to this
flag and it's not clear to me what meaning one can ascribe to the set
of flags that doesn't have KSYMS_CLOSEST set.

Ragge, do you remember what you had in mind for it when you
introduced it back in 2003?

I think we should g/c it.

-uwe




Re: #pragma once

2022-10-16 Thread Anders Magnusson

Den 2022-10-16 kl. 00:17, skrev matthew green:

it seems that pcc is missing '#pragma once' support.  at least,
the version in src.

ragge, can you fix it? :-)  thanks.

Hehe :-)  No problem, should be close to trivial.
I think the best way to detect the same file would be to just save the inode
(and device) of a "once" file.
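
For what it's worth, a minimal userland sketch of that dev/inode idea (not
the actual pcc preprocessor code; the function names are made up):

#include <sys/stat.h>
#include <stdlib.h>

struct onceent { dev_t dev; ino_t ino; };
static struct onceent *oncetab;
static size_t nonce;

/* Remember a file that contained #pragma once. */
void
once_mark(int fd)
{
	struct stat st;
	struct onceent *np;

	if (fstat(fd, &st) == -1)
		return;
	np = realloc(oncetab, (nonce + 1) * sizeof(*np));
	if (np == NULL)
		return;
	oncetab = np;
	oncetab[nonce].dev = st.st_dev;
	oncetab[nonce].ino = st.st_ino;
	nonce++;
}

/* Return nonzero if this file was already seen with #pragma once. */
int
once_seen(int fd)
{
	struct stat st;
	size_t i;

	if (fstat(fd, &st) == -1)
		return 0;
	for (i = 0; i < nonce; i++)
		if (oncetab[i].dev == st.st_dev && oncetab[i].ino == st.st_ino)
			return 1;
	return 0;
}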


-- R



Re: pcc [was Re: valgrind]

2022-03-26 Thread Anders Magnusson

Den 2022-03-22 kl. 18:47, skrev Koning, Paul:



On Mar 22, 2022, at 1:21 PM, Greg A. Woods  wrote:

At Mon, 21 Mar 2022 08:54:43 -0400 (EDT), Mouse  
wrote:
Subject: pcc [was Re: valgrind]

I've been making very-spare-time progress on building my own
compiler on and off for some years now; perhaps I'll eventually get
somewhere.  [...]

Have you looked at pcc?  http://pcc.ludd.ltu.se/ and in our source
tree in src/external/bsd/pcc .

No, I haven't.  I should - it may well end up being quicker to move an
existing compiler in the directions I want to go than to write my own.

...
I also really like PCC.  (I remember teething pains getting used to it
back when it first replaced Ritchie C on my university's PDP-11/60, but
once I actually used it for real code (i.e. assignments in those days),
and soon on the Vax too, I really liked it.)

I'm really sad that I still cannot build NetBSD entirely with PCC as the
native and only default compiler.

Out of curiosity: how does PCC code quality compare with that of GCC and (for 
targets that it supports) Clang?
Last time I tested (maybe 8 years ago?) against gcc with bytebench
(IIRC - some of those benchmarks anyway) on i386, pcc generated code that
was slightly smaller than gcc -O6 and on average 6% slower than gcc.


The slowness was mostly due to the lack of strength reduction in loops.

-- R






Re: pcc [was Re: valgrind]

2022-03-26 Thread Anders Magnusson

Den 2022-03-22 kl. 18:49, skrev bch:



On Tue, Mar 22, 2022 at 10:21 Greg A. Woods  wrote:

At Mon, 21 Mar 2022 08:54:43 -0400 (EDT), Mouse
 wrote:
Subject: pcc [was Re: valgrind]
>
> >> I've been making very-spare-time progress on building my own
> >> compiler on and off for some years now; perhaps I'll
eventually get
> >> somewhere.  [...]
> > Have you looked at pcc? http://pcc.ludd.ltu.se/ and in our source
> > tree in src/external/bsd/pcc .
>
> No, I haven't.  I should - it may well end up being quicker to
move an
> existing compiler in the directions I want to go than to write
my own.

I would like to add my voice too.

I _really_ like valgrind.  It is immensely valuable and infinitely
better than any of the compiler so-called "sanitizers" (except
maybe the
Undefined Behaviour sanitizer in Clang, which, sadly, is a necessary
evil if one is to use such a modern language bastardizer like Clang).

It's a little ugly to use, and it's a very tough task-master, but I'm
really sad that I cannot use it easily and regularly on NetBSD.

(I'm just about as sad that it no longer works on modern macOS
either.)


I also really like PCC.  (I remember teething pains getting used to it
back when it first replaced Ritchie C on my university's
PDP-11/60, but
once I actually used it for real code (i.e. assignments in those
days),
and soon on the Vax too, I really liked it.)

I'm really sad that I still cannot build NetBSD entirely with PCC
as the
native and only default compiler.



Is there a case for another concerted campaign on pcc, like the one ragge@
ran those years ago?

If I get help with finding out what does not work, then I can easily fix it.
Just fetch the latest pcc and try it out :-)

-- R






Re: pcc [was Re: valgrind]

2022-03-23 Thread Anders Magnusson

Den 2022-03-23 kl. 21:55, skrev Greg A. Woods:

At Wed, 23 Mar 2022 20:56:27 +0100, Anders Magnusson  
wrote:
Subject: Re: pcc [was Re: valgrind]

Den 2022-03-23 kl. 19:37, skrev Greg A. Woods:

Heh.  I would say PCC's generated code doesn't compare to either modern
GCC or LLVM/Clang's output.

I would say the main reason is PCC doesn't (as far as I know) employ
any "Undefined Behaviour" caveat to optimize code, for example.

I'll let the reader decide which might have the "higher" quality.

I would really want to know what you base these three statements on?

Well I've read a great deal of PDP-11 assembler as produced by PCC, and
I've fought with LLVM/Clang (and to a lesser extent with GCC) and their
undefined behaviour sanitizers (and valgrind) when trying to port old
code to these new compilers and to understand what they have done to it.

I also have way more experience than I ever really wanted in finding
bugs in a wide variety of compilers that are effectively from the same
era as PCC (e.g. especially Lattice C and early Microsoft C, which as I
recall started life as Lattice C).

Modern optimizers that take advantage of UB to do their thing can cause
very strange bugs (hidden bugs, when the UB sanitizer isn't used),
especially with legacy code, or indeed with modern code written by naive
programmers.

Note of course that I'm explicitly _not_ talking about the quality of
the _input_ code, but of the generated assembler code, and I'm assuming
that's what Paul was asking about.

One thing I don't have a good feel for though is how the code produced
by modern GCC and LLVM/Clang looks when they are told to "leave it
alone" after the first step, i.e. with "-O0", and especially as compared
to PCC with -O0.  I _think_ they should be about the same, but I dunno.
Although older compilers like PCC are very naive and simplistic in how
they generate code, my feeling is that modern compilers are even more
naive in their first step of code generation as they have come to rely
even more on their own optimizers to clean things up.  That's pure
speculation though -- I haven't worked directly with assembler code very
much at all since I left the likes of the 6502 and 8086 behind.

I get the feeling that you are talking about pcc as it was in 1979 or so?
You seem to be unaware that things have happened in the last 40+ years... :-)

-- R



Re: pcc [was Re: valgrind]

2022-03-23 Thread Anders Magnusson

Den 2022-03-23 kl. 19:37, skrev Greg A. Woods:

At Tue, 22 Mar 2022 17:47:55 +, "Koning, Paul"  wrote:
Subject: Re: pcc [was Re: valgrind]


Out of curiosity: how does PCC code quality compare with that of
GCC and (for targets that it supports) Clang?

Heh.  I would say PCC's generated code doesn't compare to either modern
GCC or LLVM/Clang's output.

I would say the main reason is PCC doesn't (as far as I know) employ
any "Undefined Behaviour" caveat to optimize code, for example.

I'll let the reader decide which might have the "higher" quality.

I would really want to know what you base these three statements on?

-- R


Re: I think I've found why Xen domUs can't mount some file-backed disk images! (vnd(4) hides labels!)

2021-04-12 Thread Anders Magnusson

Den 2021-04-12 kl. 22:16, skrev i...@netbsd.org:

On Sun, Apr 11, 2021 at 10:15:18PM -, Michael van Elst wrote:


I have also seen winchester disks with 128 byte sectors, ESDI
disks with 576 byte sectors and CD-ROM XA media uses 2352 byte
sectors.

I've seen a washing-machine-sized disk drive, used with 128-word
sectors (thus 4608 bits) on a PDP-10 installation, that was moved
after decommissioning the PDP-10 to a VAX 11/750 and reformatted to
use 512-byte (4096-bit) sectors.

http://www.netbsd.org/ports/vax/picture.html

At the right there is a reformatted RP07 which was IIRC 115MW 36-bit, 
later 516MB on the 11/780 :-)


-- R


Re: Scheduling problem - need some help here

2020-07-28 Thread Anders Magnusson

Hi,

Den 2020-07-28 kl. 13:28, skrev Nick Hudson:

On 28/06/2020 16:11, Anders Magnusson wrote:

Hi,

there is a problem (on vax) that I do not really understand. Greg Oster
filed a PR on it (#55415).

A while ago ad@ removed the  "(ci)->ci_want_resched = 1;" from
cpu_need_resched() in vax/include/cpu.h.
And as I read the code (in kern_runq.c) it shouldn't be needed,
ci_want_resched should be set already when the macro cpu_need_resched()
is invoked.

But; without setting cpu_need_resched=1 the vax performs really bad (as
described in the PR).

ci_want_resched may hold multiple flag values nowadays; setting it to 1 will
effectively clear out the other flags, which is probably what makes it work.

Anyone know what is going on here (and can explain it to me)?


I'm no expert here, but I think the expectation is that each platform
has its own method to signal "ast pending" and eventually call userret
(and preempt) when it's set - see setsoftast/aston.
VAX has hardware ASTs (AST is actually a VAX operation), which work so
that if an AST is requested, then the next time an REI to userspace is
executed it traps to the AST handler instead and then reschedules.
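
A rough sketch of the change being discussed, to make the before/after
concrete; the exact vax/include/cpu.h contents and the macro's argument
list are my assumptions, not a quote of the real header (the two
definitions are the before and after variants, not meant to coexist):

/* Before: set the MI flag by hand and post a hardware AST. */
#define cpu_need_resched(ci, flags) do {			\
	(ci)->ci_want_resched = 1;				\
	mtpr(AST_OK, PR_ASTLVL);				\
} while (/*CONSTCOND*/ 0)

/* After ad@'s change: only post the AST; the MI scheduler is	*/
/* expected to have set ci_want_resched before the macro runs.	*/
#define cpu_need_resched(ci, flags)				\
	mtpr(AST_OK, PR_ASTLVL)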


As I don't understand vax I don't know what

197 #define cpu_signotify(l) mtpr(AST_OK,PR_ASTLVL)

is expected to do, but somehow it should result in userret() being 
called.
Yep, this is the way an AST is posted. Next time an REI is executed it 
will trap to the AST subroutine.


Other points are:

- vax cpu_need_resched doesn't seem to differentiate between locally
  running lwp and an lwp running on another cpu.
Most likely.  It has been 20 years since I wrote the MP code (and probably
just as long since anyone last tested it), and at that time LWPs didn't
exist in NetBSD.  I would be surprised if it still worked :-)


- I can't see how hardclock would result in userret being called, but
  like I said - I don't know vax.
When it returns from hardclock (via REI) it directly traps to the AST 
handler instead if an AST is posted.
http://src.illumos.org/source/xref/netbsd-src/sys/arch/vax/vax/intvec.S#311 



I believe ci_want_resched is an MI variable for the scheduler which is
why its use in vax cpu_need_resched got removed.

It shouldn't be needed, but obviously something breaks if it isn't added.

What I think may have happened is that someone may have optimized
something in the MI code that expects a different behaviour than the VAX
hardware ASTs provide.  AFAIK VAX is (almost) the only port that has
hardware ASTs.


Thanks for at least looking at this.

-- Ragge


Scheduling problem - need some help here

2020-06-28 Thread Anders Magnusson

Hi,

there is a problem (on vax) that I do not really understand.  Greg Oster 
filed a PR on it (#55415).


A while ago ad@ removed the  "(ci)->ci_want_resched = 1;" from 
cpu_need_resched() in vax/include/cpu.h.
And as I read the code (in kern_runq.c) it shouldn't be needed, 
ci_want_resched should be set already when the macro cpu_need_resched() 
is invoked.


But; without setting cpu_need_resched=1 the vax performs really bad (as 
described in the PR).


ci_want_resched may hold multiple flag values nowadays; setting it to 1 will
effectively clear out the other flags, which is probably what makes it work.


Anyone know what is going on here (and can explain it to me)?

-- Ragge


Re: svr4, again

2018-12-21 Thread Anders Magnusson

Den 2018-12-20 kl. 21:29, skrev Maxime Villard:

Le 20/12/2018 à 18:11, Kamil Rytarowski a écrit :

https://github.com/krytarowski/franz-lisp-netbsd-0.9-i386

On the other hand unless we need it for bootloaders, drivers or
something needed to run NetBSD, I'm for removal of srv3, sunos etc 
compat.


Yes.

So, first things first, and to come back to my email about ibcs2: what are
the reasons for keeping it?  As I said previously, this is not for x86 but
for Vax.  As was also said, FreeBSD removed it just a few days ago.

I'm bringing up compat_ibcs2 because I did start a thread on port-vax@ about
it last year (as quoted earlier), and back then it seemed that no one knew
what the use case on Vax was.

It was something that Matt Thomas used for a customer running some commercial
program, but it was a long time ago (15 years?).  I've never heard of any
other use, so from my perspective IBCS2 is not relevant (anymore).


-- ragge


Re: Support for tv_sec=-1 (one second before the epoch) timestamps?

2018-12-13 Thread Anders Magnusson



The difference between times is an interval, or duration, not a time,
and should not be stored in a time_t, ever.

So, what type _should_ be used for it?


7.27.2.2 The difftime function

Synopsis

#include <time.h>
double difftime(time_t time1, time_t time0);

Description

The difftime function computes the difference between two calendar 
times: time1 - time0.


Returns

The difftime function returns the difference expressed in seconds as a 
double.
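
A minimal usage example of the interface quoted above (plain standard C,
nothing NetBSD-specific):

#include <stdio.h>
#include <time.h>

int
main(void)
{
	time_t start, end;

	start = time(NULL);
	/* ... do some work ... */
	end = time(NULL);

	/* The interval comes back as a double, not as a time_t. */
	printf("elapsed: %.0f seconds\n", difftime(end, start));
	return 0;
}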




Re: Missing compat_43 stuff for netbsd32?

2018-09-12 Thread Anders Magnusson

Den 2018-09-12 kl. 01:57, skrev Warner Losh:



On Tue, Sep 11, 2018, 5:48 PM Brad Spencer wrote:


Eduardo Horvath  writes:

> On Tue, 11 Sep 2018, Paul Goyette wrote:
>
>> While working on the compat code, I noticed that there are a
few old
>> syscalls which are defined in syc/compat/netbsd323/syscalls.master
>> with a type of COMPAT_43, yet there does not exist any
compat_netbsd32
>> implementation as far as I can see...
>>
>>      #64     ogetpagesize
>>      #84     owait
>>      #89     ogetdtablesize
>>      #108    osigvec
>>      #142    ogethostid (interestingly, there _is_ an
implementation
>>                      for osethostid!)
>>      #149    oquota
>>
>> Does any of this really matter?  Should we attempt to implement
them?
>
> I believe COMPAT_43 is not NetBSD 4.3 it's BSD 4.3. Anybody have
any old
> BSD 4.3 80386 binaries they still run?  Did BSD 4.3 run on an
80386?  Did
> the 80386 even exist when Berkeley published BSD 4.3?
>
> It's probably only useful for running ancient SunOS 4.x
binaries, maybe
> Ultrix, Irix or OSF-1 depending on how closely they followed BSD
4.3.
>
> Eduardo


It has been a very long time since I did this, and I may not remember
correctly, but I believe that COMPAT_43 is needed on NetBSD/i386
to run
BSDI binaries.  I remember using the BSDI Netscape 3.x binary back in
the day and I think it was required.


FreeBSD does too... net2 was closer to 4.3 system calls for many 
things than 4.4.
When I wrote the vax port I used a 4.3BSD Reno environment and NetBSD
kernels with COMPAT_43.


Trivia:  I had two 11/750s, one for compiling and one for test-booting,
and used a dual-ported RP06 to get a test kernel in there quickly :-)

-- Ragge


Re: Too many PMC implementations

2018-08-23 Thread Anders Magnusson

Den 2018-08-23 kl. 17:09, skrev Kamil Rytarowski:

On 23.08.2018 16:59, Anders Magnusson wrote:

Den 2018-08-23 kl. 16:48, skrev Kamil Rytarowski:

On 23.08.2018 16:28, Anders Magnusson wrote:

Den 2018-08-23 kl. 15:53, skrev Maxime Villard:

Le 17/08/2018 à 17:42, Kamil Rytarowski a écrit :

On 17.08.2018 17:13, Maxime Villard wrote:

Note that I'm talking about the kernel gprof, and not the userland
gprof.
In terms of kernel profiling, it's not nonsensical to say that
since we
support ARM and x86 in tprof, we can cover 99% of the MI parts of
whatever architecture. From then on, being able to profile the
kernel on
other architectures has very little interest.

Speaking realistically, probably all the recent software-based kernel
profiling was done with DTrace.

Yes. So I will proceed.

Note that the removal of the kernel gprof implies the removal of kgmon.

Just checking:  How will it work for ports like vax?
When searching for bottlenecks I normally use gprof/kgmon.  I don't know
anything about DTrace, hence the question.

-- Ragge

There is no support of DTrace for vax and probably there won't be one.
Also probably DTrace is not a final solution per se (DTrace is described
as step backwards by people such as Brendan Gregg).. but we are working
on better toolchain support to open more possibilities such as XRay.

Regarding vax there might be bottlenecks in MD code, but DTrace is a
decent one for MI code on supported ports.

Hm, so this means that we will be without kernel profiling support at
all on non-DTrace architectures?
I'm not too happy about that, for obvious reasons.

It does not work to profile code paths on other architectures, since what
takes time is very different.
And yes, it is not the MD code that is the problem, it's the MI code.

I may have missed something, but why remove something that works without
replacing it with something new?
Only having profiling on a few ports does not sound very clever to me.

-- Ragge



Evaluating this situation we have to be aware that this description
could be reversed and there are ports without meaningful (or any) gprof
support.

Observing that all the useful profiling is already done with DTrace, we
can remove complexity from the kernel with negligible cost.

This is not true.  Things that you will never notice as a problem on x86
may kill a vax, since there is a large speed factor in between.  This was
true many years ago and is still true.


Bottom line:  I think it is a bad idea to be without kernel profiling 
code on vax.


-- Ragge


Re: Too many PMC implementations

2018-08-23 Thread Anders Magnusson

Den 2018-08-23 kl. 17:03, skrev Maxime Villard:

Le 23/08/2018 à 16:28, Anders Magnusson a écrit :

Den 2018-08-23 kl. 15:53, skrev Maxime Villard:

Le 17/08/2018 à 17:42, Kamil Rytarowski a écrit :

On 17.08.2018 17:13, Maxime Villard wrote:
Note that I'm talking about the kernel gprof, and not the userland 
gprof.
In terms of kernel profiling, it's not nonsensical to say that 
since we

support ARM and x86 in tprof, we can cover 99% of the MI parts of
whatever architecture. From then on, being able to profile the 
kernel on

other architectures has very little interest.


Speaking realistically, probably all the recent software-based kernel
profiling was done with DTrace.


Yes. So I will proceed.

Note that the removal of the kernel gprof implies the removal of kgmon.

Just checking:  How will it work for ports like vax?
When searching for bottlenecks I normally use gprof/kgmon.  I don't know
anything about DTrace, hence the question.


It looks like there will be no replacement. Are you sure this is really
kgmon? Because as far as I can tell, in many architectures GPROF is just
dead code that either doesn't compile or doesn't have effect (missing
opt_gprof.h, but I did add it in February of this year in the MI parts,
so it was likely even more broken before).
I have used it not long ago on vax.  Maybe I had to do some tweaks, I do
not remember, but I really want to be able to use kernel profiling on vax.

So, I really oppose removing it and leaving vax without any kernel 
profiling choice.


-- Ragge



Re: Too many PMC implementations

2018-08-23 Thread Anders Magnusson

Den 2018-08-23 kl. 16:48, skrev Kamil Rytarowski:

On 23.08.2018 16:28, Anders Magnusson wrote:

Den 2018-08-23 kl. 15:53, skrev Maxime Villard:

Le 17/08/2018 à 17:42, Kamil Rytarowski a écrit :

On 17.08.2018 17:13, Maxime Villard wrote:

Note that I'm talking about the kernel gprof, and not the userland
gprof.
In terms of kernel profiling, it's not nonsensical to say that since we
support ARM and x86 in tprof, we can cover 99% of the MI parts of
whatever architecture. From then on, being able to profile the
kernel on
other architectures has very little interest.

Speaking realistically, probably all the recent software-based kernel
profiling was done with DTrace.

Yes. So I will proceed.

Note that the removal of the kernel gprof implies the removal of kgmon.

Just checking:  How will it work for ports like vax?
When searching for bottlenecks I normally use gprof/kgmon.  I don't know
anything about DTrace, hence the question.

-- Ragge

There is no support of DTrace for vax and probably there won't be one.
Also probably DTrace is not a final solution per se (DTrace is described
as step backwards by people such as Brendan Gregg).. but we are working
on better toolchain support to open more possibilities such as XRay.

Regarding vax there might be bottlenecks in MD code, but DTrace is a
decent one for MI code on supported ports.
Hm, so this means that we will be without kernel profiling support at
all on non-DTrace architectures?

I'm not too happy about that, for obvious reasons.

It does not work to profile code paths on other architectures, since what
takes time is very different.

And yes, it is not the MD code that is the problem, it's the MI code.

I may have missed something, but why remove something that works without
replacing it with something new?

Only having profiling on a few ports does not sound very clever to me.

-- Ragge




Re: Too many PMC implementations

2018-08-23 Thread Anders Magnusson

Den 2018-08-23 kl. 15:53, skrev Maxime Villard:

Le 17/08/2018 à 17:42, Kamil Rytarowski a écrit :

On 17.08.2018 17:13, Maxime Villard wrote:
Note that I'm talking about the kernel gprof, and not the userland 
gprof.

In terms of kernel profiling, it's not nonsensical to say that since we
support ARM and x86 in tprof, we can cover 99% of the MI parts of
whatever architecture. From then on, being able to profile the 
kernel on

other architectures has very little interest.


Speaking realistically, probably all the recent software-based kernel
profiling was done with DTrace.


Yes. So I will proceed.

Note that the removal of the kernel gprof implies the removal of kgmon.

Just checking:  How will it work for ports like vax?
When searching for bottlenecks I normally use gprof/kgmon.  I don't know 
anything about DTrace, hence the question.


-- Ragge


Re: Kernel module framework status?

2018-05-02 Thread Anders Magnusson

Den 2018-05-02 kl. 10:37, skrev Paul Goyette:
I'm trying to find some documentation of the status of the kernel
modules, but can only find some scattered postings.

What is done, what is left, are there any decision points etc...?


Anders,

You might start by looking at src/doc/TODO.modules

Doh!  Of course, thanks!

-- Ragge


Kernel module framework status?

2018-05-02 Thread Anders Magnusson

Hi all,

I'm trying to find some documentation of the status of the kernel
modules, but can only find some scattered postings.

What is done, what is left, are there any decision points etc...?

-- Ragge



Re: mmap implementation advice needed.

2018-03-31 Thread Anders Magnusson

Den 2018-03-30 kl. 22:31, skrev Joerg Sonnenberger:

On Fri, Mar 30, 2018 at 04:22:29PM -0400, Mouse wrote:

And I (and ragge, I think it was) misspoke.  It doesn't quite require
128K of contiguous physical space.  It needs two 64K blocks of
physically contiguous space, both within the block that maps system
space.  (Nothing says that P0 PTEs have to be anywhere near P1 PTEs in
system virtual space, but they do have to be within system space.)

...and the problem to be solved here is that the memory has become
fragmented enough that you can't find 64KB of contiguous pages?
If so, what about having a fixed set of emergency reservations and
copying the non-contiguous pmap content into that during context switch?
It's not only contiguous memory that is the problem; the memory must be
in the system page table, whose placement and size are determined at boot.


The usrptmap should (in an ideal world) be sized depending on available
user memory and maxusers.
Until then, we'll live with these limits, which is not a problem on vax;
I only want to avoid unexpected hangs and crashes.


-- Ragge


Re: mmap implementation advice needed.

2018-03-31 Thread Anders Magnusson

Den 2018-03-30 kl. 20:43, skrev matthew green:

A resource limit for mmap in total would solve the problem though.

RLIMIT_AS?  you'll have to add support to set it in MD code,
but eg. these lines should help.

465:uvm_init_limits(struct proc *p)
[..]
479:p->p_rlimit[RLIMIT_AS].rlim_cur = RLIM_INFINITY;
480:p->p_rlimit[RLIMIT_AS].rlim_max = RLIM_INFINITY;


Thanks!  This was exactly what I wanted!  Problem solved!

Hm, why didn't I see it myself when looking? :-)
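
For the archives, a hypothetical sketch of the kind of MD clamp being
discussed; cpu_init_limits and VAX_AS_LIMIT are made-up names and the
value is an example only, this is not actual NetBSD code:

/* Cap the default address-space limit so a process can never map	*/
/* more user VM than the boot-time page tables can describe.		*/
#define VAX_AS_LIMIT	(512 * 1024 * 1024)	/* example value only */

void
cpu_init_limits(struct proc *p)
{

	p->p_rlimit[RLIMIT_AS].rlim_cur = VAX_AS_LIMIT;
	p->p_rlimit[RLIMIT_AS].rlim_max = VAX_AS_LIMIT;
}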

-- R


Re: mmap implementation advice needed.

2018-03-30 Thread Anders Magnusson

Den 2018-03-30 kl. 16:46, skrev Christos Zoulas:

In article <1ce8eac5-3639-aec1-0e0c-fe857f49b...@ludd.ltu.se>,
Anders Magnusson  <ra...@ludd.ltu.se> wrote:

Hi tech-kern,

I'm trying to solve PR#28379 and ran into a problem and I don't really
understand how it is supposed to work:
If a process tries to mmap for example a file with a length of just over
1GB it will always succeed as I understand the code, but that may not be
true depending on the underlying hardware, and I cannot find any way to
control this from the MD code...?

On vax, for example, large mmap's cannot be done due to hardware
constraints.
In the above example it will cause the mmap() to succeed, but when
touching the pages it will hang forever since there will never be
available pte's.

So, any advice on how a maximum size of allowed mmap'able memory can be controlled?

Notes about vax memory management if someone is wondering:
- 2 areas (P0 and P1) of size 1G each, P0 grows from bottom, P1 grows
from top (intended for stack).

- The PTEs for KVM must be in contiguous physical memory, hence the
allocation for one process with all of P0 and P1 mapped takes 128k.
- Vax uses VM_MAP_TOPDOWN so that not too much of KVM space is needed
for mmap.

Perhaps we should add a resource limit for contiguous memory allocations.
RLIMIT_MEMCONT?  The actual value can be MD.

That will not solve the problem; just do two mmaps and we are at
the same spot again.

The problem is that too much virtual memory can be allocated.

A resource limit for mmap in total would solve the problem though.

-- Ragge




mmap implementation advice needed.

2018-03-30 Thread Anders Magnusson

Hi tech-kern,

I'm trying to solve PR#28379 and ran into a problem and I don't really 
understand how it is supposed to work:
If a process tries to mmap for example a file with a length of just over 
1GB it will always succeed as I understand the code, but that may not be 
true depending on the underlying hardware, and I cannot find any way to 
control this from the MD code...?


On vax, for example, large mmap's cannot be done due to hardware 
constraints.
In the above example it will cause the mmap() to succeed, but when 
touching the pages it will hang forever since there will never be 
available pte's.


So, any advice on how a maximum size of allowed mmap'able memory can be controlled?

Notes about vax memory management if someone is wondering:
- 2 areas (P0 and P1) of size 1G each, P0 grows from bottom, P1 grows 
from top (intended for stack).
- The PTEs for KVM must be in contiguous physical memory, hence the
allocation for one process with all of P0 and P1 mapped takes 128k (see
the arithmetic sketched below).
- Vax uses VM_MAP_TOPDOWN so that not too much of KVM space is needed 
for mmap.
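
A rough back-of-the-envelope check of that 128k figure, assuming the
traditional VAX parameters of 512-byte pages and 4-byte PTEs (my numbers,
not from the original mail): one 1 GB region needs 2^30 / 2^9 = 2M PTEs,
i.e. an 8 MB page table sitting in system space, and mapping that 8 MB
page table in turn takes 8 MB / 512 = 16K system PTEs = 64 KB; P0 plus P1
fully mapped thus comes to 2 x 64 KB = 128 KB, matching the two 64K blocks
mentioned elsewhere in the thread.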


-- Ragge


Re: /dev/ksyms permissions

2018-01-17 Thread Anders Magnusson

Den 2018-01-17 kl. 20:20, skrev Mouse:

Maybe group kmem read, but that might require more elevated
privileges in the programs that uses ksyms.

What program uses ksyms now that doesn't require at least group kmem?

You cannot give up kmem read privileges when calling ksyms read
routines.

I don't see why not - or, at least, I don't see the ksyms change as
being relevant.  Just read /dev/ksyms at startup (at the same time as
you open /dev/kmem, probably), before dropping group kmem.  Isn't that
all this change (making /dev/ksyms 440 root:kmem) requires?

You still have to call library functions with more elevated privileges than
today.  It may not be a big problem, but the code should be audited first
and this behaviour documented.
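
A rough sketch of the open-early-then-drop pattern described above, using
the standard libkvm entry point; the error handling is minimal and the
program structure is illustrative, not a quote of ps or netstat:

#include <sys/types.h>
#include <fcntl.h>
#include <kvm.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	char errbuf[_POSIX2_LINE_MAX];
	kvm_t *kd;

	/* Open kernel memory/symbols while still set-group-id kmem... */
	kd = kvm_openfiles(NULL, NULL, NULL, O_RDONLY, errbuf);
	if (kd == NULL) {
		fprintf(stderr, "kvm_openfiles: %s\n", errbuf);
		return 1;
	}

	/* ...then drop the elevated group before doing anything else. */
	if (setgid(getgid()) == -1) {
		perror("setgid");
		return 1;
	}

	/* Later kvm_nlist()/kvm_read() calls still work on the handle. */
	kvm_close(kd);
	return 0;
}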

-- Ragge


Re: /dev/ksyms permissions

2018-01-17 Thread Anders Magnusson

Den 2018-01-17 kl. 20:03, skrev Mouse:



Maybe group kmem read, but that might require more elevated
privileges in the programs that uses ksyms.

What program uses ksyms now that doesn't require at least group kmem?


You cannot give up kmem read privileges when calling ksyms read routines.
Think setegid().

-- Ragge


Re: /dev/ksyms permissions

2018-01-17 Thread Anders Magnusson
libkvm uses it to get the kernel symbol namelist instead of reading 
/netbsd for it (originally kvmdb, which was retired when ksyms was added).
Programs like ps, netstat etc. use it to find in-kernel stuff, so you
cannot change it to require root privs to be read.
Maybe group kmem read, but that might require more elevated privileges
in the programs that use ksyms.


-- Ragge

Den 2018-01-17 kl. 16:25, skrev co...@sdf.org:

This leaks information that an unprivileged user probably has no reason to
own:


cat /dev/ksyms > ksyms
readelf -a ksyms |wc -l

47594

Any strong reason not to apply the following?
Presumably it will have benefits for GENERIC_KASLR, or people with
Intel CPUs :-)




Re: Removing ARCNET stuffs

2015-06-08 Thread Anders Magnusson

Andrew Cagney skrev den 2015-06-08 19:18:

I'm clearly out-of-date regarding SSA, it's nice to be corrected.

No problem :-)


On 8 June 2015 at 09:06, Anders Magnusson ra...@ludd.ltu.se wrote:

Andrew Cagney skrev den 2015-06-01 20:41:
I do not understand why either of those choices need to be taken.
Pcc has a reasonable intermediate representation, which in the optimizer
is converted to SSA form, hammered on, and converted back.  This is
done while retaining the intermediate representation, which is no problem.

I'm being fast and loose.  My reading of the code was that debug info
was being generated by the back of the front end (very roughly
gimplify in this diagram of GCC
https://gcc.gnu.org/projects/tree-ssa/#ssa).   It was pretty much hard
wired printfs, and explained to me why -g -O wasn't supported.

printf's are only used for data (which is spit out directly), not code.
All code is dealt with function-wise, otherwise things like the register
allocator would not work.


Unless, I guess, what you're talking about is throwing away the
existing backend entirely and writing a new SSA-based one, in which
case I'd gently suggest that this is a large project :-)

Exactly :-(


What is wrong with the existing backend?

As you state, the representation gets taken into and then out of SSA.
Why bother; at least for the converting to-ssa side?


The rest of the compiler works on the (quite simple) internal representation,
which is very easy to deal with.  The SSA conversion code is only a minor part
of the compiler backend (and somewhat complex), so it is better to keep it
separate.
Also, going directly to SSA in pass1 would require all the different language
frontends to have knowledge about it, which is unnecessary.

Basic steps in the backend are:
- Delete redundant jumps (simplifies SSA)
- SSA conversion,  (optim), remove phi nodes
- Assign instructions
- Allocate registers
- Emit code

Turning off optimizations basically just skips the first two steps :-)
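
A tiny schematic example of the SSA round trip (plain C plus comments;
not pcc's actual intermediate representation):

/* Source-level view of a value merged from two branches. */
int
pick(int c, int a, int b)
{
	int x;

	if (c)
		x = a;		/* SSA: x1 = a */
	else
		x = b;		/* SSA: x2 = b */
	return x;		/* SSA: x3 = phi(x1, x2); return x3 */
}

/* Converting back out of SSA replaces the phi with copies at the end of
 * each predecessor block, after which instruction selection and register
 * allocation run on the ordinary representation again. */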

-- Ragge


Re: Removing ARCNET stuffs

2015-06-08 Thread Anders Magnusson

David Holland skrev den 2015-06-03 07:56:


   PCC, to the best of my knowledge is still in the [very early] planning
   stages.  One of its design choices would be to go pure SSA.  Another
   option, closer to GCC (RTL), would be to retain existing code-gen
   passes.  Tough choices.

I'm not sure why it's such a big deal, except that everything in gcc
is a big deal because gcc is such a mess inside.

Moving an existing representation to SSA just requires adding phi
nodes to the representation, writing a check pass to enforce the
single assignment property, and updating existing passes to maintain
the property -- in a language without proper algebraic data types this
is a bigger deal than otherwise but it does not seem like a
particularly major undertaking. Unless the representation is totally
wrong in other ways that need to be rectified first.

SSA representation was added to pcc in 2008.

backend is an orthogonal issue.
  
   I'm being fast and loose.  My reading of the code was that debug info
   was being generated by the back of the front end (very roughly
   gimplify in this diagram of GCC
   https://gcc.gnu.org/projects/tree-ssa/#ssa).   It was pretty much hard
   wired printfs, and explained to me why -g -O wasn't supported.

printfing from the back of the front end is definitely totally wrong
in other ways that need to be rectified first :(

Hm, I may be missing something, but what is wrong?
Where should you print it out otherwise?

-- Ragge


Re: Removing ARCNET stuffs

2015-06-08 Thread Anders Magnusson

Andrew Cagney skrev den 2015-06-01 17:41:

On 1 June 2015 at 02:15, Anders Magnusson ra...@ludd.ltu.se wrote:

Andrew Cagney skrev den 2015-06-01 03:24:

systems and generates reasonable code.  Unfortunately, and sorry PCC
(stabs, really?),

Feel free to add dwarf, the source is out there, and it wouldn't be
especially difficult to do it.  I just haven't had time.
Stabs was for free :-)

I'm not so sure (a year back I looked at the code with that in mind),
and wonder if any quick hack would end up being opportunity lost.

PCC, as a classic C compiler, only generates debug information at
-O0.  This is because the stabs code is restricted to the
un-optimized code generator path.  Having the backend restricted to
DWARF when '-O0' might just be ok, were it not for SSA (static single
assignment).

I beg to differ.
- There is no requirement to avoid optimizations when debugging, and there
has not been one for many years.
- Why should there be any correlation between the debug format and whether
SSA representation is used or not?

Just the SSA side of things (if debugging is ignored) is a lot of work
(LLVM's solution was to largely ignore debugging.  I once asked
Lattner directly about this and he answered that he considered it a
back-end problem).

I must admit that I do not understand what would be the problem here.
The SSA conversion (usually) only converts and moves stuff, which will
not affect the debug info more than marginally.

-- Ragge


Re: Removing ARCNET stuffs

2015-06-08 Thread Anders Magnusson

Andrew Cagney skrev den 2015-06-01 22:50:

Like I mentioned in another reply, I'm being a little fast and loose.

The file cc/ccom/scan.l from
http://pcc.ludd.ltu.se/fisheye/browse/pcc/pcc/cc/ccom/scan.l?r=1.127
which I'm assuming is the C parser is doing this:

#define STABS_LINE(x) if (gflag && cftnsp) stabs_line(x)
...
\n{ ++lineno; STABS_LINE(lineno); }

which, I believe, is executing this:

	cprint(1, "\t.stabn %d,0,%d," STABLBL "\n" STABLBL ":\n",
	    N_SLINE, line, stablbl, stablbl);

and this will will insert the stab into the assembler stream
(send_passt).  However, and here's the key thing, as best I can tell,
the stab isn't tied to an instruction, just a position in the
instruction stream.

It is tied to a statement (which is the only thing known at that time) in
the statement stream.

If an optimizer so much as sneezes, re-ordering instructions for
instance, the information is wrong.

Not necessarily, since the optimizer is quite aware of the existence
of debug info, and for stabs this is not a problem since it is
statement-based and will always put the line label at the correct place
with regard to statements.

If instructions between statements are interleaved we will always lose.

When the back-end goes to generate assembler, all it has, is a string.

Which is sufficient in this case.  A better way to do it would be to
tie the debug info to the statement struct, but there has not been any reason
to do that for stabs (which works well enough for normal debugging).


In the case of variables.  They are generated at the start of a block,
so have no connection at all to the code.

perhaps I'm wrong?


They can be declared anywhere in the code, but nevertheless it does not matter
where, as long as they refer to the correct function.
Note that the backend does not care about the language; for example, the f77
compiler is included in the package.  I haven't moved over the pascal frontend
yet, though :-)


-- Ragge


Re: Removing ARCNET stuffs

2015-06-08 Thread Anders Magnusson

Andrew Cagney skrev den 2015-06-01 20:41:

On 1 June 2015 at 12:54, David Holland dholland-t...@netbsd.org wrote:

On Mon, Jun 01, 2015 at 11:41:38AM -0400, Andrew Cagney wrote:
systems and generates reasonable code.  Unfortunately, and sorry PCC
(stabs, really?),
   
Feel free to add dwarf, the source is out there, and it wouldn't be
especially difficult to do it.  I just haven't had time.
Stabs was for free :-)
  
   I'm not so sure (a year back I looked at the code with that in mind),
   and wonder if any quick hack would end up being opportunity lost.

I have not looked at it, nor have I looked at pcc at all in a long
time, so what I'm missing may just be otherwise obvious context, but:

   PCC, as a classic C compiler, only generates debug information at
   -O0.  This is because the stabs code is restricted to the
   un-optimized code generator path.  Having the backend restricted to
   DWARF when '-O0' might just be ok, were it not for SSA (static single
   assignment).
  
   To my mind, and I'm assuming a pure SSA compiler design, having SSA
   forces issues like: [...]

I'm missing something; SSA is just a style of program representation.

Yes.  Let's think of Static Single Assignment as the pure academic theory.

LLVM[Lattner et.al.] and GIMPLE[Novillo et.al.] are real world
implementations of that theory.
https://gcc.gnu.org/projects/tree-ssa/#ssa has a good diagram and is a
relevant read.

PCC, to the best of my knowledge is still in the [very early] planning
stages.  One of its design choices would be to go pure SSA.  Another
option, closer to GCC (RTL), would be to retain existing code-gen
passes.  Tough choices.

I do not understand why either of those choices need to be taken.
Pcc has a reasonable intermediate representation, which in the optimizer
is converted to SSA form, hammered on, and converted back.  This is
done while retaining the intermediate representation, which is no problem.

I'm being fast and loose.  My reading of the code was that debug info
was being generated by the back of the front end (very roughly
gimplify in this diagram of GCC
https://gcc.gnu.org/projects/tree-ssa/#ssa).   It was pretty much hard
wired printfs, and explained to me why -g -O wasn't supported.

printf's are only used for data (which is spit out directly), not code.
All code is dealt with function-wise, otherwise things like the register
allocator would not work.


Unless, I guess, what you're talking about is throwing away the
existing backend entirely and writing a new SSA-based one, in which
case I'd gently suggest that this is a large project :-)

Exactly :-(


What is wrong with the existing backend?

-- Ragge



Re: Removing ARCNET stuffs

2015-06-01 Thread Anders Magnusson

Andrew Cagney skrev den 2015-06-01 03:24:

systems and generates reasonable code.  Unfortunately, and sorry PCC
(stabs, really?),
Feel free to add dwarf, the source is out there, and it wouldn't be 
especially difficult to do it.  I just haven't had time.

Stabs was for free :-)

-- Ragge



Re: asymmetric smp

2014-04-02 Thread Anders Magnusson

Martin Husemann skrev 2014-04-02 15:33:

On Wed, Apr 02, 2014 at 03:13:19PM +0200, Johnny Billquist wrote:

What model of VAX do you have, and how long does it take to boot, to the
point where you get the login prompt on the console?

VS4000/M96 with 128 MB, and local scsi disk - very nice machine ;-)
Didn't measure exactly right now, but on the order of 30s.


Heh, that machine is like 30 times faster than Johnny's :-)

-- R


Re: Closing a serial device takes one second

2014-02-07 Thread Anders Magnusson

Michael van Elst skrev 2014-02-07 10:40:

On the other hand, serial printers are mostly a thing of the past.
Not at all, they are really common in special systems, like receipt 
printers in stores.


-- Ragge


Re: A simple cpufreq(9)

2011-09-29 Thread Anders Magnusson

On 09/29/2011 02:50 PM, Mouse wrote:

The cache and mmu are probably harder than the cpu :-)
 

I'm not sure the PDP-10 even _had_ cache; I'd have to do some digging
on that score.  And I have no idea what it had for an MMU.  The only
non-power-of-two-word-size machine I've ever actually used, as far as I
can recall, was a PDP-8.  I'm interested in NetBSD/pdp10 less for
personal nostalgia value than for the code cleanup it would enforce.
   

I did a basic implementation of the PDP10 instruction set in Verilog a few
years ago for a Xilinx FPGA, using only on-chip memory, which made
it easy.  The tricky part is the memory management, which is quite odd
on all PDP10 CPUs (and totally different between them!).

-- Ragge


Vax interlock insn's (Re: Please do not yell at people for trying to help you.)

2010-11-13 Thread Anders Magnusson

(a little side-note, but may be interesting)

On 11/13/2010 04:17 AM, Matt Thomas wrote:

Eventually, most operations come down to compare and swap.  It's just too
damn useful to not have as a primitive.  Even if some of the platforms
have to emulate it somehow.  Just because an architecture is 30+ years old
doesn't mean we are forced to ignore algorithms that came after its birth.

The VAX now has a fast non-MP emulation of atomic_cas so that should be
less of an issue.
   

When I tested NetBSD/vax MP on an 8353 some years ago I found that
it ended up unexpectedly slow in some situations.  Some timing showed
that the interlocked insns were _really_ slow compared to their
non-interlocked versions; insqti took almost 100 times as long as insque,
and bbssi was not that slow compared to bbss, but an unreasonable amount
of time was spent on the buses.

Using an MP VAX was just not worth it (at least not with the BI bus).
Better to optimize for the UP VAX and make it work reasonably well.

-- Ragge