Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-15 Thread Giacomo Tesio
On Sun, 14 Oct 2018 at 19:39, Ole-Hjalmar Kristensen wrote:
>
> OK, that makes sense. So it would not stop a client from, for example, first
> reading an index block in a B-tree, waiting for the result, and then issuing
> read operations for all the data blocks in parallel.

If the client is the kernel that's true.
If the client is directly speaking 9P that's true again.

But if the client is a userspace program using pread/pwrite, that
wouldn't work unless it forks a new process for each read, as the
syscalls block.
Which is what fcp does, actually:
https://github.com/brho/plan9/blob/master/sys/src/cmd/fcp.c
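
A minimal sketch of the same idea, in Python on a POSIX system rather than
Plan 9 C. It is not fcp's code: fcp forks processes, while this uses a thread
pool as a stand-in, and the function name, chunk size and worker count are
made up for the example. Each worker still sits in its own blocking pread(),
so several reads are outstanding at once.

```python
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 8192  # hypothetical chunk size, chosen only for this example


def parallel_read(path, nworkers=4, chunk=CHUNK):
    """Read a whole file with several blocking pread() calls in flight."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offsets = range(0, size, chunk)
        # Each worker blocks in its own pread(). The offset is passed
        # explicitly, so workers do not share or disturb a file position.
        with ThreadPoolExecutor(max_workers=nworkers) as pool:
            parts = list(pool.map(lambda off: os.pread(fd, chunk, off), offsets))
    finally:
        os.close(fd)
    # pool.map yields results in the order of the offsets, not in the order
    # the reads happened to complete, so the pieces join up correctly.
    return b"".join(parts)
```

Calling parallel_read("/some/file") returns the file's bytes; how many reads
really overlap depends on the kernel and the file server behind the file.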


Giacomo



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-14 Thread hiro
also read what has been written before about fcp. and read the source of fcp.

On 10/14/18, Ole-Hjalmar Kristensen  wrote:
> OK, that makes sense. So it would not stop a client from, for example, first
> reading an index block in a B-tree, waiting for the result, and then issuing
> read operations for all the data blocks in parallel. That's exactly the same as
> any asynchronous disk subsystem I am acquainted with. Reordering is the
> norm.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-14 Thread Ole-Hjalmar Kristensen
OK, that makes sense. So it would not stop a client from, for example, first
reading an index block in a B-tree, waiting for the result, and then issuing
read operations for all the data blocks in parallel. That's exactly the same as
any asynchronous disk subsystem I am acquainted with. Reordering is the
norm.
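
A small simulation of that access pattern, in Python with asyncio. Nothing
here talks 9P: the block table, the block numbers and the random delays are
invented for the example. It only shows the dependency (index first, then all
data reads at once) and that completion order need not match issue order.

```python
import asyncio
import random

# Hypothetical "disk": block number -> contents. Block 0 is the index block
# and lists the data blocks that belong to it.
BLOCKS = {0: [3, 7, 5], 3: "alpha", 5: "beta", 7: "gamma"}


async def read_block(n, completed):
    """Simulate one read with a random service time."""
    await asyncio.sleep(random.uniform(0, 0.01))
    completed.append(n)  # record the order in which reads finish
    return BLOCKS[n]


async def lookup():
    completed = []
    # Step 1: the index block must arrive before anything else can be issued.
    index = await read_block(0, completed)
    # Step 2: the data reads are independent, so issue them all at once.
    data = await asyncio.gather(*(read_block(n, completed) for n in index))
    # gather() returns results in issue order, whatever order they finished in.
    return dict(zip(index, data)), completed
```

Running asyncio.run(lookup()) gives the block contents keyed by block number,
plus the completion order, which usually differs from one run to the next.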

On Sun, Oct 14, 2018 at 1:21 PM hiro <23h...@gmail.com> wrote:

> there's no tyranny involved.
>
> a client that is fine with the *responses* coming in reordered could
> remember the tag obviously and do whatever you imagine.
>
> the problem is potential reordering of the messages in the kernel
> before responding, even if the 9p transport has guaranteed ordering.


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-14 Thread hiro
there's no tyranny involved.

a client that is fine with the *responses* coming in reordered could
remember the tag obviously and do whatever you imagine.

the problem is potential reordering of the messages in the kernel
before responding, even if the 9p transport has guaranteed ordering.

On 10/14/18, Ole-Hjalmar Kristensen  wrote:
> I'm not going to argue with someone who has got his hands dirty by actually
> doing this but I don't really get this about the tyranny of 9p. Isn't the
> point of the tag field to identify the request? What is stopping the client
> from issuing multiple requests and matching the replies based on the tag? From
> the manual:
>
> Each T-message has a tag field, chosen and used by the
>   client to identify the message.  The reply to the message
>   will have the same tag.  Clients must arrange that no two
>   outstanding messages on the same connection have the same
>   tag.  An exception is the tag NOTAG, defined as (ushort)~0
>   in <fcall.h>: the client can use it, when establishing a
>   connection, to override tag matching in version messages.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-14 Thread Ole-Hjalmar Kristensen
I'm not going to argue with someone who has got his hands dirty by actually
doing this but I don't really get this about the tyranny of 9p. Isn't the
point of the tag field to identify the request? What is stopping the client
from issuing multiple requests and matching the replies based on the tag? From
the manual:

Each T-message has a tag field, chosen and used by the
  client to identify the message.  The reply to the message
  will have the same tag.  Clients must arrange that no two
  outstanding messages on the same connection have the same
  tag.  An exception is the tag NOTAG, defined as (ushort)~0
  in <fcall.h>: the client can use it, when establishing a
  connection, to override tag matching in version messages.
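
The bookkeeping that paragraph asks of a client can be sketched in a few lines
of Python. This is an illustration only, not code from any 9P library: the
TagTable class and its method names are hypothetical, and requests and replies
are plain Python values rather than wire messages.

```python
NOTAG = 0xFFFF  # (ushort)~0, reserved for version messages


class TagTable:
    """Client-side bookkeeping for outstanding requests, keyed by tag."""

    def __init__(self):
        self.pending = {}  # tag -> request still waiting for its reply
        self.next_tag = 0

    def send(self, request):
        """Pick a tag that no outstanding message uses and record the request."""
        if len(self.pending) >= NOTAG:
            raise RuntimeError("no free tags")
        while self.next_tag in self.pending:
            self.next_tag = (self.next_tag + 1) % NOTAG  # never hands out NOTAG
        tag = self.next_tag
        self.pending[tag] = request
        self.next_tag = (tag + 1) % NOTAG
        return tag

    def receive(self, tag, reply):
        """Match a reply to its request; the tag becomes free for reuse."""
        request = self.pending.pop(tag)
        return request, reply
```

A client would call send() once per T-message and receive() for each R-message
as it arrives; the replies may come back in any order, since each one is looked
up by its tag.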



On Wed, 10 Oct 2018 at 23:56, Steven Stallion wrote:

> As the guy who wrote the majority of the code that pushed those 1M 4K
> random IOPS erik mentioned, this thread annoys the shit out of me. You
> don't get an award for writing a driver. In fact, it's probably better
> not to be known at all considering the bloody murder one has to commit
> to marry hardware and software together.
>
> Let's be frank, the I/O handling in the kernel is anachronistic. To
> hit those rates, I had to add support for asynchronous and vectored
> I/O not to mention a sizable bit of work by a co-worker to properly
> handle NUMA on our appliances to hit those speeds. As I recall, we had
> to rewrite the scheduler and re-implement locking, which even Charles
> Forsyth had a hand in. Had we the time and resources to implement
> something like zero-copy we'd have done it in a heartbeat.
>
> In the end, it doesn't matter how "fast" a storage driver is in Plan 9
> - as soon as you put a 9P-based filesystem on it, it's going to be
> limited to a single outstanding operation. This is the tyranny of 9P.
> We (Coraid) got around this by avoiding filesystems altogether.
>
> Go solve that problem first.
> On Wed, Oct 10, 2018 at 12:36 PM  wrote:
> >
> > > But the reason I want this is to reduce latency to the first
> > > access, especially for very large files. With read() I have
> > > to wait until the read completes. With mmap() processing can
> > > start much earlier and can be interleaved with background
> > > data fetch or prefetch. With read() a lot more resources
> > > are tied down. If I need random access and don't need to
> > > read all of the data, the application has to do pread(),
> > > pwrite() a lot thus complicating it. With mmap() I can just
> > > map in the whole file and excess reading (beyond what the
> > > app needs) will not be a large fraction.
> >
> > you think doing single 4K page sized reads in the pagefault
> > handler is better than doing precise >4K reads from your
> > application? possibly in a background thread so you can
> > overlap processing with data fetching?
> >
> > the advantage of mmap is not prefetch. it's about not doing
> > any I/O when data is already in the *SHARED* buffer cache!
> > which plan9 does not have (except the mntcache, but that is
> > optional and only works for the disk fileservers that maintain
> > their file qid ver info consistently). it *IS* really a linux
> > thing where all block device i/o goes thru the buffer cache.
> >
> > --
> > cinap
> >
>
>


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Lyndon Nerenberg
Digby R.S. Tarvin writes:

> Oh yes, I read Eldon Hall's book on that quite a few years ago. Meetings
> held to discuss competing potential uses for a word of memory that had
> become free.

> That one would be a challenging Plan9 port..

And yet Plan9 was not there to save the day.  Such a pity.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Digby R.S. Tarvin
Oh yes, I read Eldon Hall's book on that quite a few years ago. Meetings
held to discuss competing potential uses for a word of memory that had
become free.

That one would be a challenging Plan9 port..

On Fri, 12 Oct 2018 at 05:13, Lyndon Nerenberg  wrote:

> Digby R.S. Tarvin writes:
>
> > Agreed, but the PDP11/70 was not constrained to 64KB memory either.
>
> > I do recall the MS-DOS small/large/medium etc models that used the
> > segmentation in various ways to mitigate the limitations of being a 16
> bit
> > computer. Similar techniques were possible on the PDP11, for example
>
> Coincidental to this conversation, I'm currently reading "The Apollo
> Guidance Computer: Architecture and Operation" by _Frank O'Brien_.
> (ISBN 978-1-4419-0876-6)  Very interesting to see what you can do with
> a 15 bit architecture when sufficiently motivated.
>
> --lyndon
>
>


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Lyndon Nerenberg
hiro writes:

> don't you need sending ability, too for AIS?

No, a receive-only setup is very useful on a small boat.  Where I
would like to go with this is to take the decoded AIS data as input
for "ARPA" style collision plots.  I'm interested in the big boats
sailing through the strait.  They can't turn fast, and rarely
change course.  If I can derive their intentions, I can plot a path
between them that requires the least amount of tacking.

The big boats, in turn, have no interest in us little critters.
They actively filter out the "class B" (I think that's the term)
noise that are AIS transmissions from the small craft.  Even if we
hit them, we can't sink them, so they don't care about us.  Therefore
there is no incentive for small boats to transmit AIS.  Unless you're
trying to locate your buddies for a tie-up somewhere.  (That can
be a very valid reason for transmitting!)

--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread hiro
>> I assumed you were using an RTL2832U (rtlsdr library).
>
> I'm pretty sure they all do, under the hood.
>
>

don't you need sending ability, too for AIS?



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread hiro
> need to prove it can be done with the usual
> suspects (GNU radio, on the Pi -- the native fft libraries seem fast
> enough to make this viable).

be assured i've demodulated 25khz signals in real-time and it's a walk
in the park, as long as your revision has the neon stuff i mentioned,
otherwise the fft becomes bottleneck.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Lyndon Nerenberg
Skip Tavakkolian writes:

> I assumed you were using an RTL2832U (rtlsdr library).

I'm pretty sure they all do, under the hood.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Skip Tavakkolian
I assumed you were using an RTL2832U (rtlsdr library).

On Thu, Oct 11, 2018, 12:40 PM Lyndon Nerenberg  wrote:

> > I was able to use dump1090 (same author as redis) to get ADSB data
> reliably
> > on RPi/Linux a while back.
>
> I have a pair of Flightbox ADS-B receivers I am using as references.
> While mostly reliable, they can and do stutter along with the rest
> of the alternatives on occasion.
>
> --lyndon
>
>


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Lyndon Nerenberg
> I was able to use dump1090 (same author as redis) to get ADSB data reliably
> on RPi/Linux a while back.

I have a pair of Flightbox ADS-B receivers I am using as references.
While mostly reliable, they can and do stutter along with the rest
of the alternatives on occasion.

--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Lyndon Nerenberg
hiro writes:

> But given the alternatives available back then, even the armv5 in the
> kirkwood, which was cheaper even before the rpi became popular, did
> the same job more stably, which is why i would never actually
> recommend the pi. And there are even more alternatives now.

I get that. But the actual hardware driving this conversation isn't
particularly relevant, and devolving to a hardware bikeshed isn't
helpful.  (Not picking on you specifically.)

> Are you doing the AIS demodulation on plan9 on rpi? It would be a
> great showcase. Wish I had been given the opportunity to find an
> excuse to build something like that on plan9 instead :)

Not yet.  First I need to prove it can be done with the usual
suspects (GNU radio, on the Pi -- the native fft libraries seem fast
enough to make this viable).  If the pessimized case works, then
porting the code from the GNU radio python modules to C is a
mechanical process for the most part.  This week I am ENOTIME with
getting the boat tarped up in preparation for the winter monsoon
season :-P.

--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Skip Tavakkolian
I was able to use dump1090 (same author as redis) to get ADSB data reliably
on RPi/Linux a while back.

On Thu, Oct 11, 2018, 10:54 AM Lyndon Nerenberg  wrote:

> > I have been able to copy 1 GiB/s to userspace from an nvme device. I
> should
> > think a radio should be no problem.
>
> The problem is when you have multiple decoder blocks implemented
> as individual processes (i.e. the GNU radio model).  Once you have
> everything debugged, you can put it into a single threaded process
> and eliminate the copy overhead.  But it's completely impractical
> to prototype or debug real applications this way.  And it's the
> prototyping case I'm interested in here.
>
> So I'm *curious* to know if page flipping a 'protocol buffer' like
> object between processes provides an optimization over copying
> through the kernel.  Not so much for the speed aspect, but to free
> up CPU cycles that can be devoted to actual SDR work.
>
> Since when did curiosity become a capital crime?   Oh, wait, that
> was January 20, 2017.  My bad.
>
> --lyndon
>
>


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread hiro
We also have CPU extensions that can help make fast FFT, because it's
such a generic problem, and in the worst case you can use fpgas,
asics, in any case dedicated hardware.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread hiro
i meant without having to resort to some soft fp.

On 10/11/18, hiro <23h...@gmail.com> wrote:
>> through the kernel.  Not so much for the speed aspect, but to free
>> up CPU cycles that can be devoted to actual SDR work.
>
> those 2x25kHz channels would hardly need many cycles. rather it's just
> a matter of selecting the right CPU that can actually do the FFT with
> some software floating point implementation :)
>
> i don't see memory bandwidth or even random memory access latency
> affecting this scenario in the slightest.
>



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread hiro
> through the kernel.  Not so much for the speed aspect, but to free
> up CPU cycles that can be devoted to actual SDR work.

those 2x25kHz channels would hardly need many cycles. rather it's just
a matter of selecting the right CPU that can actually do the FFT with
some software floating point implementation :)

i don't see memory bandwidth or even random memory access latency
affecting this scenario in the slightest.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread hiro
> One example is for an AIS transceiver on a boat.  By putting the
> radio and decoder at the top of the mast, the backhaul can be a
> cat-3 twisted pair cable, rather than a much heavier coax run from
> the antenna at the top of the mast to the receiver below decks.

Yeah, I've been sending 3Mbit I/Q samples over ethernet to a more
beefy computer. For non-technical crowds I described the rpi as a
passable USB->ethernet gateway for SDR tasks in that bandwidth.

But given the alternatives available back then, even the armv5 in the
kirkwood, which was cheaper even before the rpi became popular, did
the same job more stably, which is why i would never actually
recommend the pi. And there are even more alternatives now.

Even the rpi itself is proof that better alternatives exist (as they
did even back then when the first one came out), because the newer rpi
revision (i think) has finally gained neon cpu extensions, which
surprisingly have been supported by gnuradio long before this, and a
reason why my bachelor thesis back then was an easy success :)

In general all limits that occurred to me on the rpi were due to
stability (usb power and compatibility issues), but more concretely
for our discussion: lack of cpu power, mainly for the FFT. There were
no throughput, delay or memory copy bottlenecks for me.

This was using linux, because my mouse didn't work on the old rpi
plan9 image and sadly there was a time-limit...

Are you doing the AIS demodulation on plan9 on rpi? It would be a
great showcase. Wish I had been given the opportunity to find an
excuse to build something like that on plan9 instead :)



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Lyndon Nerenberg
Digby R.S. Tarvin writes:

> Agreed, but the PDP11/70 was not constrained to 64KB memory either.

> I do recall the MS-DOS small/large/medium etc models that used the
> segmentation in various ways to mitigate the limitations of being a 16 bit
> computer. Similar techniques were possible on the PDP11, for example

Coincidental to this conversation, I'm currently reading "The Apollo
Guidance Computer: Architecture and Operation" by _Frank O'Brien_.
(ISBN 978-1-4419-0876-6)  Very interesting to see what you can do with
a 15 bit architecture when sufficiently motivated.

--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Kurt H Maier
On Thu, Oct 11, 2018 at 10:54:22AM -0700, Lyndon Nerenberg wrote:
>
> Since when did curiosity become a capital crime?   Oh, wait, that
> was January 20, 2017.  My bad.

Turns out it's not, so you can climb down off your cross.  It's just
that it helps to be a little clearer about your meaning, that's all.
Otherwise you might do something embarrassing, like posting SAS
controller code into an NVMe discussion.

khm



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Lyndon Nerenberg
> I have been able to copy 1 GiB/s to userspace from an nvme device. I should
> think a radio should be no problem.

The problem is when you have multiple decoder blocks implemented
as individual processes (i.e. the GNU radio model).  Once you have
everything debugged, you can put it into a single threaded process
and eliminate the copy overhead.  But it's completely impractical
to prototype or debug real applications this way.  And it's the
prototyping case I'm interested in here.

So I'm *curious* to know if page flipping a 'protocol buffer' like
object between processes provides an optimization over copying
through the kernel.  Not so much for the speed aspect, but to free
up CPU cycles that can be devoted to actual SDR work.
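
To make the question concrete, here is a rough Python sketch of the no-copy
alternative. It is not page flipping and not Plan 9 code: it uses a shared
mapping from the standard library (multiprocessing.shared_memory, Python 3.8
or later) as the nearest stand-in, and the function name and buffer size are
made up. Both handles live in one process for brevity; in a real pipeline the
consumer would attach from another process using the segment's name.

```python
from multiprocessing import shared_memory


def shared_buffer_demo(nbytes=4096):
    """Hand a buffer from a 'producer' to a 'consumer' without copying it."""
    # Producer side: allocate a named shared segment and fill it in place.
    seg = shared_memory.SharedMemory(create=True, size=nbytes)
    try:
        seg.buf[:5] = b"hello"
        # Consumer side: attach to the same segment by name. Only the name
        # has to travel between the two sides, not the samples themselves.
        view = shared_memory.SharedMemory(name=seg.name)
        try:
            first = bytes(view.buf[:5])
            # A write through one mapping is visible through the other,
            # which shows that both refer to the same memory.
            seg.buf[0] = ord("j")
            second = bytes(view.buf[:5])
        finally:
            view.close()
    finally:
        seg.close()
        seg.unlink()
    return first, second
```

The sketch says nothing about cost. Whether this frees meaningful CPU time
compared with copying through a pipe is the open question in the message, and
it would have to be measured on the target hardware.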

Since when did curiosity become a capital crime?   Oh, wait, that
was January 20, 2017.  My bad.

--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Lyndon Nerenberg
hiro writes:

> Does this include demodulation on the pi?

Yes.  At least to a certain extent.  The idea is to get from the
high-bitrate I/Q data to something more amenable to transmission
over an RS-422 (or -485) serial drop.

One example is for an AIS transceiver on a boat.  By putting the
radio and decoder at the top of the mast, the backhaul can be a
cat-3 twisted pair cable, rather than a much heavier coax run from
the antenna at the top of the mast to the receiver below decks.

Reducing the weight at the top of the mast reduces the moment arm
acting on the boat, significantly enhancing the stability of a
sailboat (which is how I got started down this road to begin with).

--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-11 Thread Aram Hăvărneanu
> Posted August 15th, 2013:
>   https://9p.io/sources/contrib/stallion/src/sdmpt2.c Corresponding
> announcement:
>   https://groups.google.com/forum/#!topic/comp.os.plan9/134-YyYnfbQ

This is not an NVMe driver.

-- 
Aram Hăvărneanu



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Steven Stallion
Interesting - was this ever generalized? It's been several years since
I last looked, but I seem to recall that unless you went out of your
way to write your own 9P implementation, you were limited to a single
tag.
On Wed, Oct 10, 2018 at 7:51 PM Skip Tavakkolian
 wrote:
>
> For operations that matter in this context (read, write), there can be 
> multiple outstanding tags. A while back rsc implemented fcp, partly to prove 
> this point.
>
> On Wed, Oct 10, 2018 at 2:54 PM Steven Stallion  wrote:
>>
>> As the guy who wrote the majority of the code that pushed those 1M 4K
>> random IOPS erik mentioned, this thread annoys the shit out of me. You
>> don't get an award for writing a driver. In fact, it's probably better
>> not to be known at all considering the bloody murder one has to commit
>> to marry hardware and software together.
>>
>> Let's be frank, the I/O handling in the kernel is anachronistic. To
>> hit those rates, I had to add support for asynchronous and vectored
>> I/O not to mention a sizable bit of work by a co-worker to properly
>> handle NUMA on our appliances to hit those speeds. As I recall, we had
>> to rewrite the scheduler and re-implement locking, which even Charles
>> Forsyth had a hand in. Had we the time and resources to implement
>> something like zero-copy we'd have done it in a heartbeat.
>>
>> In the end, it doesn't matter how "fast" a storage driver is in Plan 9
>> - as soon as you put a 9P-based filesystem on it, it's going to be
>> limited to a single outstanding operation. This is the tyranny of 9P.
>> We (Coraid) got around this by avoiding filesystems altogether.
>>
>> Go solve that problem first.
>> On Wed, Oct 10, 2018 at 12:36 PM  wrote:
>> >
>> > > But the reason I want this is to reduce latency to the first
>> > > access, especially for very large files. With read() I have
>> > > to wait until the read completes. With mmap() processing can
>> > > start much earlier and can be interleaved with background
>> > > data fetch or prefetch. With read() a lot more resources
>> > > are tied down. If I need random access and don't need to
>> > > read all of the data, the application has to do pread(),
>> > > pwrite() a lot thus complicating it. With mmap() I can just
>> > > map in the whole file and excess reading (beyond what the
>> > > app needs) will not be a large fraction.
>> >
>> > you think doing single 4K page sized reads in the pagefault
>> > handler is better than doing precise >4K reads from your
>> > application? possibly in a background thread so you can
>> > overlap processing with data fetching?
>> >
>> > the advantage of mmap is not prefetch. its about not to do
>> > any I/O when data is already in the *SHARED* buffer cache!
>> > which plan9 does not have (except the mntcache, but that is
>> > optional and only works for the disk fileservers that maintain
>> > their file qid ver info consistently). its *IS* really a linux
>> > thing where all block device i/o goes thru the buffer cache.
>> >
>> > --
>> > cinap
>> >
>>



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Skip Tavakkolian
For operations that matter in this context (read, write), there can be
multiple outstanding tags. A while back rsc implemented fcp, partly to
prove this point.
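
A minimal sketch of the tag idea, as a toy simulation rather than real 9P: the
client gives each read its own tag, the "server" answers in a different order,
and the client pairs replies with requests by tag alone. The function names and
the message tuples are hypothetical; no wire encoding or 9P library is involved.

```python
import random

def issue_reads(offsets, count):
    """Build one Tread-like request per offset, each with a unique tag."""
    return {tag: ("Tread", tag, off, count) for tag, off in enumerate(offsets)}

def serve(requests, data, seed=0):
    """Toy server: answers every request, but in shuffled order."""
    replies = [("Rread", tag, data[off:off + count])
               for _, tag, off, count in requests.values()]
    random.Random(seed).shuffle(replies)
    return replies

def match_replies(requests, replies):
    """Pair each reply with its request by tag; return data keyed by offset."""
    out = {}
    for _, tag, payload in replies:
        _, _, off, _ = requests[tag]
        out[off] = payload
    return out

data = bytes(range(64))
reqs = issue_reads([0, 16, 32, 48], 16)          # four reads outstanding at once
result = match_replies(reqs, serve(reqs, data))  # replies arrive reordered
reassembled = b"".join(result[off] for off in sorted(result))
```

Because the tag identifies the request, the order in which replies arrive does
not matter to the client in this sketch.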

On Wed, Oct 10, 2018 at 2:54 PM Steven Stallion  wrote:

> As the guy who wrote the majority of the code that pushed those 1M 4K
> random IOPS erik mentioned, this thread annoys the shit out of me. You
> don't get an award for writing a driver. In fact, it's probably better
> not to be known at all considering the bloody murder one has to commit
> to marry hardware and software together.
>
> Let's be frank, the I/O handling in the kernel is anachronistic. To
> hit those rates, I had to add support for asynchronous and vectored
> I/O not to mention a sizable bit of work by a co-worker to properly
> handle NUMA on our appliances to hit those speeds. As I recall, we had
> to rewrite the scheduler and re-implement locking, which even Charles
> Forsyth had a hand in. Had we the time and resources to implement
> something like zero-copy we'd have done it in a heartbeat.
>
> In the end, it doesn't matter how "fast" a storage driver is in Plan 9
> - as soon as you put a 9P-based filesystem on it, it's going to be
> limited to a single outstanding operation. This is the tyranny of 9P.
> We (Coraid) got around this by avoiding filesystems altogether.
>
> Go solve that problem first.
> On Wed, Oct 10, 2018 at 12:36 PM  wrote:
> >
> > > But the reason I want this is to reduce latency to the first
> > > access, especially for very large files. With read() I have
> > > to wait until the read completes. With mmap() processing can
> > > start much earlier and can be interleaved with background
> > > data fetch or prefetch. With read() a lot more resources
> > > are tied down. If I need random access and don't need to
> > > read all of the data, the application has to do pread(),
> > > pwrite() a lot thus complicating it. With mmap() I can just
> > > map in the whole file and excess reading (beyond what the
> > > app needs) will not be a large fraction.
> >
> > you think doing single 4K page sized reads in the pagefault
> > handler is better than doing precise >4K reads from your
> > application? possibly in a background thread so you can
> > overlap processing with data fetching?
> >
> > the advantage of mmap is not prefetch. its about not to do
> > any I/O when data is already in the *SHARED* buffer cache!
> > which plan9 does not have (except the mntcache, but that is
> > optional and only works for the disk fileservers that maintain
> > > their file qid ver info consistently). its *IS* really a linux
> > thing where all block device i/o goes thru the buffer cache.
> >
> > --
> > cinap
> >
>
>


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Digby R.S. Tarvin
On Wed, 10 Oct 2018 at 21:40, Ethan Gardener  wrote:

> >
> > Not sure I would agree with that. The 20 bit addressing of the 8086 and
> > 8088 did not change their 16 bit nature. They were still 16 bit program
> > counter, with segmentation to provide access to a larger memory - similar
> > in principle to the PDP11 with MMU.
>
> That's not at all the same as being constrained to 64KB memory.  Are we
> communicating at cross purposes here?  If we're not, if I haven't
> misunderstood you, you might want to read up on creating .exe files for
> MS-DOS.


Agreed, but the PDP11/70 was not constrained to 64KB memory either.

I do recall the MS-DOS small/large/medium etc models that used the
segmentation in various ways to mitigate the limitations of being a 16 bit
computer. Similar techniques were possible on the PDP11, for example
Modula-2/VRS under RT-11 used the MMU to transparently support 4MB programs
back in 1984 (it used trap instructions to implement subroutine calls).

It wasn't possible under Unix, of course, because there were no system
calls for manipulating the mmu. Understandable, as it would have
complicated the security model in a multi-tasking system. Something neither
MS-DOS nor RT-11 had to deal with.

Address space manipulation was more convenient with Intel segmentation
because the instruction set included procedure call/return instructions
that manipulated the segmentation registers, but the situation was not
fundamentally different.  They were both 16 bit machines with hacks to give
access to a larger than 64K physical memory.
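
For reference, the 8086 real-mode address calculation can be sketched as
follows. The function name is made up for illustration; the arithmetic
(16-bit segment shifted left by 4, plus a 16-bit offset, truncated to 20
address lines) is the documented behaviour of the 8086/8088.

```python
def physical_address(segment, offset):
    """8086 real-mode translation: segment * 16 + offset, kept to 20 bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset are 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF

top = physical_address(0xFFFF, 0x000F)     # highest byte of the 1MB space
start = physical_address(0x1000, 0x0000)   # first byte reachable via segment 0x1000
end = physical_address(0x1000, 0xFFFF)     # last byte reachable via segment 0x1000
window = end - start + 1                   # bytes visible without changing the segment
```

Each segment register value exposes a 64K window, so a program still sees 16-bit
offsets; reaching the rest of memory means reloading a segment register.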

The OS9 operating system allowed some control of application memory maps in
a Unix-like environment by supporting dynamic (but explicit) link and
unlink of subroutine and data modules - which would be added and removed
from your 64K address space as required. So more analogous to memory based
overlays.


> > I went Commodore Amiga at about that time - because it at least
> > supported some form of multi-tasking out of the box, and I spent many
> > happy hours getting OS9 running on it. An interesting architecture,
> > capable of some impressive graphics, but subject to quite severe
> > limitations which made general purpose graphics difficult. (Commodore later
> > released SVR4 Unix for the A3000, but limited X11 to monochrome when using
> > the inbuilt graphics).
>
> It does sound like fun. :)  I'm not surprised by the monochrome graphics
> limitation after my calculations.  Still, X11 or any other window system
> which lacks a backing store may do better in low-memory environments than
> Plan 9's present draw device.  It's a shame, a backing store is a great
> simplification for programmers.
>

X11 does, of course, support the concept of a backing store. It just
doesn't mandate it. It was an expensive thing to provide back when X11 was
young, so pretty rare. I remember finding the need to be able to re-create
windows on demand rather annoying when I first learned to program in Xlib,
but once you get used to it I find it can lead to benefits when you have to
retain a knowledge of how an image is created, not just the end result.


> > But being 32 bit didn't give it a huge advantage over the 16 bit x86
> > systems for tinkering with operating system, because the 68000 had no MMU.
> > It was easier to get a Unix like system going with 16 bit segmentation than
> > a 32 bit linear space and no hardware support for run time relocation.
> > (OS9 used position independent code throughout to work without an MMU,
> > but didn't try to implement fork() semantics).
>
> I'm sometimes tempted to think that fork() is freakishly high-level crazy
> stuff. :)  Still, like backing store, it's very nice to have.
>

I agree. Very elegant when you compare it to the hoops you have to jump
through to initialize the child process environment in systems with the
more common combined 'forkexec' semantics, but a real sticking point for
low end hardware.


> > It wasn't till the 68030 based Amiga 3000 came out in 1990 that it
> > really did everything I wanted. The 68020 with an optional MMU was
> > equivalent, but not so common in consumer machines.
> >
> > Hardware progress seems to have been rather uninteresting since then.
> > Sure, hardware is *much* faster and *much* bigger, but fundamentally the
> > same architecture. Intel had a brief flirtation with a novel architecture
> > with the iAPX 432 in 81, but obviously found that it was more profitable
> > making the familiar architecture bigger and faster.
>
> I rather agree.  Multi-core and hyperthreading don't bring in much from an
> operating system designer's perspective, and I think all the interesting
> things about caches are means of working around their problems.


I don't think anyone would bother with multiple cores or caches if that
same performance could be achieved without them.  They just buy a bit more
performance at the cost of additional software complexity.

I would very much like to get my hands on a ga144 to see what sort of
> operating system structure 

Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Steven Stallion
Posted August 15th, 2013: https://9p.io/sources/contrib/stallion/src/sdmpt2.c
Corresponding announcement:
https://groups.google.com/forum/#!topic/comp.os.plan9/134-YyYnfbQ
On Wed, Oct 10, 2018 at 5:31 PM Kurt H Maier  wrote:
>
> On Wed, Oct 10, 2018 at 04:54:22PM -0500, Steven Stallion wrote:
> > As the guy
>
> might be worth keeping in mind the current most common use case for nvme
> is laptop storage and not building jet engines in coraid's basement
>
> so the nvme driver that cinap wrote works on my thinkpad today and is
> about infinity times faster than the one you guys locked up in the
> warehouse at the end of raiders of the lost ark, because my laptop can't
> seem to boot off nostalgia.
>
> so no, nobody gets an award for writing a driver.  but cinap won the
> 9front Order of Valorous Service (with bronze oak leaf cluster,
> signifying working code) for *releasing* one.  I was there when field
> marshal aiju presented the award; it was a very nice ceremony.
>
> anyway, someone once said communication is not a zero-sum game.  the
> hyperspecific use case you describe is fine but there are other reasons
> to care about how well this stuff works, you know?
>
> khm
>



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Kurt H Maier
On Wed, Oct 10, 2018 at 04:54:22PM -0500, Steven Stallion wrote:
> As the guy 

might be worth keeping in mind the current most common use case for nvme
is laptop storage and not building jet engines in coraid's basement

so the nvme driver that cinap wrote works on my thinkpad today and is 
about infinity times faster than the one you guys locked up in the 
warehouse at the end of raiders of the lost ark, because my laptop can't
seem to boot off nostalgia.

so no, nobody gets an award for writing a driver.  but cinap won the
9front Order of Valorous Service (with bronze oak leaf cluster,
signifying working code) for *releasing* one.  I was there when field
marshal aiju presented the award; it was a very nice ceremony.

anyway, someone once said communication is not a zero-sum game.  the
hyperspecific use case you describe is fine but there are other reasons
to care about how well this stuff works, you know?

khm



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread cinap_lenrek
hahahahahahahaha

--
cinap



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Steven Stallion
As the guy who wrote the majority of the code that pushed those 1M 4K
random IOPS erik mentioned, this thread annoys the shit out of me. You
don't get an award for writing a driver. In fact, it's probably better
not to be known at all considering the bloody murder one has to commit
to marry hardware and software together.

Let's be frank, the I/O handling in the kernel is anachronistic. To
hit those rates, I had to add support for asynchronous and vectored
I/O not to mention a sizable bit of work by a co-worker to properly
handle NUMA on our appliances to hit those speeds. As I recall, we had
to rewrite the scheduler and re-implement locking, which even Charles
Forsyth had a hand in. Had we the time and resources to implement
something like zero-copy we'd have done it in a heartbeat.

In the end, it doesn't matter how "fast" a storage driver is in Plan 9
- as soon as you put a 9P-based filesystem on it, it's going to be
limited to a single outstanding operation. This is the tyranny of 9P.
We (Coraid) got around this by avoiding filesystems altogether.

Go solve that problem first.
On Wed, Oct 10, 2018 at 12:36 PM  wrote:
>
> > But the reason I want this is to reduce latency to the first
> > access, especially for very large files. With read() I have
> > to wait until the read completes. With mmap() processing can
> > start much earlier and can be interleaved with background
> > data fetch or prefetch. With read() a lot more resources
> > are tied down. If I need random access and don't need to
> > read all of the data, the application has to do pread(),
> > pwrite() a lot thus complicating it. With mmap() I can just
> > map in the whole file and excess reading (beyond what the
> > app needs) will not be a large fraction.
>
> you think doing single 4K page sized reads in the pagefault
> handler is better than doing precise >4K reads from your
> application? possibly in a background thread so you can
> overlap processing with data fetching?
>
> the advantage of mmap is not prefetch. its about not to do
> any I/O when data is already in the *SHARED* buffer cache!
> which plan9 does not have (except the mntcache, but that is
> optional and only works for the disk fileservers that maintain
> > their file qid ver info consistently). its *IS* really a linux
> thing where all block device i/o goes thru the buffer cache.
>
> --
> cinap
>



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Digby R.S. Tarvin
Well, I think 'avoid at all costs'  is a bit strong.

The Raspberry Pi is a good little platform for the right applications, so
long as you are aware of its limitations. I use one as my 'always on' home
server to give me access to files when travelling (the networking is slow by
LAN standards, but ok for WAN), and another for my energy monitoring
system. It is good for experimenting with OS's, especially networking OS's
like Plan9 where price is important if you want to try a large number of
hosts. It's good for teaching/learning. Or for running/trying different
operating systems without having to spend time and resources setting up VMs
(downloading and flashing an sd card image is quick and takes up no space
on my main systems).

Just don't plan on deploying RPi's for mission critical applications that
have demanding I/O or processing requirements. It was never intended to
compete in that market.

On Wed, 10 Oct 2018 at 20:54, hiro <23h...@gmail.com> wrote:

> I agree, if you have a choice avoid rpi at all costs.
> Even if the software side of that other board was less pleasant at least
> it worked with my mouse and keyboard!! :)
>
> As I said I was looking at 2Mbit/s stuff, which is nothing, even over USB.
> But my point is that even though this number is low, the rpi is too limited
> to do any meaningful processing anyway (ignoring the usb troubles and lack
> of ethernet). It's a mobile phone soc after all, where the modulation is
> done by dedicated chips, not on cpu! :)
>
> On Wednesday, October 10, 2018, Digby R.S. Tarvin wrote:
> > I don't know which other ARM board you tried, but I have always found
> > the terrible I/O performance of the Pi to be a bigger problem than the ARM
> > speed.  The USB2 interface is really slow, and there aren't really many
> > other (documented) alternative options. The Ethernet goes through the same
> > slow USB interface, and there is only so much that you can do bit bashing
> > data with GPIO's.  The sdCard interface seems to be the only non-usb
> > filesystem I/O available. And that in turn limits the viability of
> > relieving the RAM constraints with virtual memory. So the ARM processor
> > itself is not usually the problem for me.
> >
> > In general I find the pi a nice little device for quite a few things -
> > like low power, low bandwidth, low cost servers or displays with plenty of
> > open source compatibility. Or hacking/prototyping where I don't want to
> > have to worry too much about blowing things up. But it is not good for high
> > throughput I/O,  memory intensive applications, or anything requiring a lot
> > of processing power.
> >
> > The validity of your conclusion regarding low power ARM in general
> > probably depends on what the other board you tried was..
> >
> > DigbyT
> > On Wed, 10 Oct 2018 at 17:51, hiro <23h...@gmail.com> wrote:
> >>
> >> > Eliminating as much of the copy in/out WRT the kernel cannot but
> >> > help, especially when you're doing SDR decoding near the radios
> >> > using low-powered compute hardware (think Pies and the like).
> >>
> >> Does this include demodulation on the pi? cause even when i dumped the
> >> pi i was given for that purpose (with a <2Mbit I/Q stream) and
> >> replaced it with some similar ARM platform that at least had neon cpu
> >> instruction extensions for faster floating point operations, I was
> >> barely able to run a small FFT.
> >>
> >> My conclusion was that these low-powered ARM systems are just good
> >> enough for gathering low-bandwidth, non-critical USB traffic, like
> >> those raw I/Q samples from a dongle, but unfit for anything else.
> >>
> >


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread cinap_lenrek
> But the reason I want this is to reduce latency to the first
> access, especially for very large files. With read() I have
> to wait until the read completes. With mmap() processing can
> start much earlier and can be interleaved with background
> data fetch or prefetch. With read() a lot more resources
> are tied down. If I need random access and don't need to
> read all of the data, the application has to do pread(),
> pwrite() a lot thus complicating it. With mmap() I can just
> map in the whole file and excess reading (beyond what the
> app needs) will not be a large fraction.

you think doing single 4K page sized reads in the pagefault
handler is better than doing precise >4K reads from your
application? possibly in a background thread so you can
overlap processing with data fetching?

the advantage of mmap is not prefetch. its about not to do
any I/O when data is already in the *SHARED* buffer cache!
which plan9 does not have (except the mntcache, but that is
optional and only works for the disk fileservers that maintain
their file qid ver info consistently). its *IS* really a linux
thing where all block device i/o goes thru the buffer cache.

--
cinap
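
The "precise reads in a background thread" approach can be sketched roughly as
below. This is a POSIX-flavoured illustration in Python, not Plan 9 code:
`os.pread` and `ThreadPoolExecutor` are real standard-library calls, while
`CHUNK`, `process_file` and `work` are names invented for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 64 * 1024  # read size, deliberately larger than a 4K page

def process_file(fd, size, work):
    """Fetch chunk n+1 in a background thread while work() runs on chunk n."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(os.pread, fd, CHUNK, 0)
        off = 0
        while off < size:
            buf = pending.result()          # wait for the chunk already in flight
            if not buf:                     # unexpected end of file
                break
            nxt = off + len(buf)
            if nxt < size:                  # start the next read before processing
                pending = pool.submit(os.pread, fd, CHUNK, nxt)
            results.append(work(buf))
            off = nxt
    return results

total = 3 * CHUNK + 100
with tempfile.TemporaryFile() as f:
    f.write(b"x" * total)
    f.flush()
    sizes = process_file(f.fileno(), total, len)  # here "work" just measures the chunk
```

The application chooses the read size and the amount of read-ahead itself,
which is the control a page-fault handler does not give it.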



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread cinap_lenrek
oh! you wrote a nvme driver TOO? where can i find it?

maybe we can share some knowledge. especially regarding
some quirks. i dont own hardware myself, so i wrote it
using an emulator over a weekend and tested it on a
work machine afterwork.

http://code.9front.org/hg/plan9front/log/9df9ef969856/sys/src/9/pc/sdnvme.c

--
cinap



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread erik quanstrom
> > with meltdown/Spectre mitigations in place, I would like to see evidence 
> > that flip is faster than copy.
> 
> If your system is well balanced, you should be able to
> stream data as fast as memory allows[1]. In such a system
> copying things N times will reduce throughput by similar
> factor. It may be that plan9 underperforms so much this
> doesn't matter normally.

sure.  but flipping page tables is also not free.  there is a huge cost in
processor stalls, etc.  spectre and meltdown mitigations make this worse as
each page flip has to be accompanied by a complete pipeline flush or other
costly mitigation.  (not that this was cheap to begin with)

it's also not an object to move data as fast as possible.  the object is to do
work as fast as possible.

> [1] See: https://code.kx.com/q/cloud/aws/benchmarking/
> A single q process can ingest data at 1.9GB/s from a
> single drive. 16 can achieve 2.7GB/s, with theoretical
> max being 2.8GB/s.

with my same crappy un-optimized nvme driver, i was able to hit 2.5-2.6 GiB/s
with two very crappy nvme drives.  (are your numbers really GB rather than
GiB?)  i am sure i could scale that linearly.  there's plenty of memory
bandwidth left, but i haven't got any more nvme.  :-)
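
The unit question matters here because the two prefixes differ by about 7%.
A small conversion sketch (the helper name is made up for illustration) shows
that 2.5-2.6 GiB/s is roughly 2.68-2.79 GB/s:

```python
GIB = 2**30   # bytes in a binary gigabyte (GiB)
GB = 10**9    # bytes in a decimal gigabyte (GB)

def gib_to_gb(rate_gib):
    """Convert a rate in GiB/s to the same rate in GB/s."""
    return rate_gib * GIB / GB

low = gib_to_gb(2.5)    # about 2.68 GB/s
high = gib_to_gb(2.6)   # about 2.79 GB/s
```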

similarly coraid built an appliance that did copying (due to cache) and hit 1
million 4k iops.  this was in 2011 or so.

but, so what.  all this proves is that with copying or without, we can ingest
enough data for even the most hungry programs.

unless you have data that shows otherwise.  :-)

- erik



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Steve Simon


people come down very hard on the pi.

here are my times for building the pi kernel. i rebuilt it a few times to push
data into any caches available.

pi3+ with a high-ish spec sd card: 23 secs
dual intel atom 1.8GHz with an SSD: 9 secs

the pi is slower, but not 10 times slower.
however it does cost a 10th of the price and consumes a 10th of the electricity.

i use the order of magnitude test as that is (in my experience) what you need
to make a really noticeable difference (to stuff in general).

i use one daily as a plan9 terminal, for which i feel its ideal.

-Steve






Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Ethan Gardener
On Tue, Oct 9, 2018, at 8:14 PM, Lyndon Nerenberg wrote:
> hiro writes:
> 
> > Huh? What exactly do you mean? Can you describe the scenario and the
> > measurements you made?
> 
> The big one is USB.  disk/radio->kernel->user-space-usbd->kernel->application.
> Four copies.
> 
> I would like to start playing with software defined radio on Plan
> 9, but that amount of data copying is going to put a lot of pressure
> on the kernel to keep up.  UNIX/Linux suffers the same copy bloat,
> and it's having trouble keeping up, too.

References, please.  Programmers are notoriously bad at determining the cause
of performance problems.  Examining the source will help to see if "copy bloat"
is the actual problem.

> 
> --lyndon
> 


-- 
Progress might have been all right once, but it has gone on too long
-- Ogden Nash



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Ethan Gardener
On Tue, Oct 9, 2018, at 11:22 PM, Digby R.S. Tarvin wrote:
> 
> 
> On Tue, 9 Oct 2018 at 23:00, Ethan Gardener  wrote:
>> 
>> Fascinating thread, but I think you're off by a decade with the 16-bit 
>> address bus comment, unless you're not actually talking about Plan 9.  The 
>> 8086 and 8088 were introduced with 20-bit addressing in 1978 and 1979 
>> respectively.  The IBM PC, launched in 1981, had its ROM at the top of that 
>> 1MByte space, so it couldn't have been constrained in that way.  By the end 
>> of the 80s, all my schoolmates had 68k-powered computers from Commodore and 
>> Atari, showing hardware with a 24-bit address space was very much affordable 
>> and ubiquitous at the time Plan 9 development started.  Almost all of them 
>> had 512KB at the time.  A few flashy gits had 1MB machines. :)
> 
> Not sure I would agree with that. The 20 bit addressing of the 8086 and 8088 
> did not change their 16 bit nature. They were still 16 bit program counter, 
> with segmentation to provide access to a larger memory - similar in principle 
> to the PDP11 with MMU. 

That's not at all the same as being constrained to 64KB memory.  Are we 
communicating at cross purposes here?  If we're not, if I haven't misunderstood 
you, you might want to read up on creating .exe files for MS-DOS.  

> The first 32 bit x86 processor was the 386, which I think came out in 1985, 
> very close to when work on Plan9 was rumored to have  started. So it seemed 
> not impossible that work might have started on an older 16 bit machine, but  
> at Bell Labs probably a long shot.

Mmh, rumors. I read they were starting to think about Plan 9 in 1985, but I 
haven't read anything about it being up and running until '89 or '90.  There's 
not much to go on.

>> I still wish I'd kept the better of the Atari STs which made their way down 
>> to me -- a "1040 STE" -- 1MB with a better keyboard and ROM than the earlier 
>> "STFM" models.  I remember wanting to try to run Plan 9 on it.  Let's 
>> estimate how tight it would be...
>>  
>>  I think it would be terrible, because I got frustrated enough trying to run 
>> a 4e CPU server with graphics on a 2GB x86.  I kept running out of image 
>> memory!  The trouble was the draw device in 4th edition stores images in the 
>> same "image memory" the kernel loads programs into, and the 386 CPU kernel 
>> 'only' allocates 64MB of that. :)  
>>  
>>  1 bit per pixel would obviously improve matters by a factor of 16 compared 
>> to my setup, and 640x400 (Atari ST high resolution) would be another 5 times 
>> smaller than my screen.  Putting these numbers together with my experience, 
>> you'd have to be careful to use images sparingly on a machine with 800KB 
>> free RAM after the kernel is loaded.  That's better than I thought, probably 
>> achievable on that Atari I had, but it couldn't be used as intensively as I 
>> used Plan 9 back then.  
>>  
>>  How could it be used?  I think it would be a good idea to push the draw 
>> device back to user space and make very sure to have it check for failing 
>> malloc!  I certainly wouldn't want a terminal with a filesystem and graphics 
>> all on a single 1MByte 68000-powered computer, because a filesystem on a 
>> terminal runs in user space, and thus requires some free memory to run the 
>> programs to shut it down.  Actually, Plan 9's separation of terminal from 
>> filesystem seems quite the obvious choice when I look at it like this. :)  
> 
> I went Commodore Amiga at about that time - because it at least supported 
> some form of multi-tasking out of the box, and I spent many happy hours 
> getting OS9 running on it.. An interesting architecture, capable of some 
> impressive graphics, but subject to quite severe limitations which made 
> general purpose graphics difficult. (Commodore later released SVR4 Unix for 
> the A3000, but limited X11 to monochrome when using the inbuilt graphics).

It does sound like fun. :)  I'm not surprised by the monochrome graphics 
limitation after my calculations.  Still, X11 or any other window system which 
lacks a backing store may do better in low-memory environments than Plan 9's 
present draw device.  It's a shame, a backing store is a great simplification 
for programmers.
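
The calculation referred to above (1 bit per pixel at 640x400 against roughly
800KB of free RAM) can be sketched like this. The figures come from the quoted
message; the helper name is invented, and the result ignores everything else
that would need memory.

```python
def image_bytes(width, height, bits_per_pixel):
    """Memory needed for one uncompressed image."""
    return width * height * bits_per_pixel // 8

FREE_RAM = 800 * 1024                  # "800KB free RAM after the kernel is loaded"
mono = image_bytes(640, 400, 1)        # Atari ST high resolution at 1 bit per pixel
deep = image_bytes(640, 400, 16)       # same size at 16 bits per pixel
mono_images = FREE_RAM // mono         # full-screen monochrome images that fit
```

Even at one bit per pixel only a couple of dozen full-screen images fit, which
is why images would have to be used sparingly on such a machine.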

> But being 32 bit didn't give it a huge advantage over the 16 bit x86 systems 
> for tinkering with operating system, because the 68000 had no MMU.  It was 
> easier to get a Unix like system going with 16 bit segmentation than a 32 bit 
> linear space and no hardware support for run time relocation.
> (OS9 used position independent code throughout to work without an MMU, but 
> didn't try to implement fork() semantics).

I'm sometimes tempted to think that fork() is freakishly high-level crazy 
stuff. :)  Still, like backing store, it's very nice to have.

> It wasn't till the 68030 based Amiga 3000 came out in 1990 that it really did 
> everything I wanted. The 68020 with an optional MMU was equivalent, but not 
> so common in 

Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread hiro
I agree, if you have a choice avoid rpi at all costs.
Even if the software side of that other board was less pleasant at least it
worked with my mouse and keyboard!! :)

As I said I was looking at 2Mbit/s stuff, which is nothing, even over USB.
But my point is that even though this number is low, the rpi is too limited
to do any meaningful processing anyway (ignoring the usb troubles and lack
of ethernet). It's a mobile phone soc after all, where the modulation is
done by dedicated chips, not on cpu! :)

On Wednesday, October 10, 2018, Digby R.S. Tarvin wrote:
> I don't know which other ARM board you tried, but I have always found
> the terrible I/O performance of the Pi to be a bigger problem than the ARM
> speed.  The USB2 interface is really slow, and there aren't really many
> other (documented) alternative options. The Ethernet goes through the same
> slow USB interface, and there is only so much that you can do bit bashing
> data with GPIO's.  The sdCard interface seems to be the only non-usb
> filesystem I/O available. And that in turn limits the viability of
> relieving the RAM constraints with virtual memory. So the ARM processor
> itself is not usually the problem for me.
>
> In general I find the pi a nice little device for quite a few things -
> like low power, low bandwidth, low cost servers or displays with plenty of
> open source compatibility. Or hacking/prototyping where I don't want to
> have to worry too much about blowing things up. But it is not good for high
> throughput I/O,  memory intensive applications, or anything requiring a lot
> of processing power.
>
> The validity of your conclusion regarding low power ARM in general
> probably depends on what the other board you tried was..
>
> DigbyT
> On Wed, 10 Oct 2018 at 17:51, hiro <23h...@gmail.com> wrote:
>>
>> > Eliminating as much of the copy in/out WRT the kernel cannot but
>> > help, especially when you're doing SDR decoding near the radios
>> > using low-powered compute hardware (think Pies and the like).
>>
>> Does this include demodulation on the pi? cause even when i dumped the
>> pi i was given for that purpose (with a <2Mbit I/Q stream) and
>> replaced it with some similar ARM platform that at least had neon cpu
>> instruction extensions for faster floating point operations, I was
>> barely able to run a small FFT.
>>
>> My conclusion was that these low-powered ARM systems are just good
>> enough for gathering low-bandwidth, non-critical USB traffic, like
>> those raw I/Q samples from a dongle, but unfit for anything else.
>>
>


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Digby R.S. Tarvin
I don't know which other ARM board you tried, but I have always found
the terrible I/O performance of the Pi to be a bigger problem than the ARM
speed.  The USB2 interface is really slow, and there aren't really many
other (documented) alternative options. The Ethernet goes through the same
slow USB interface, and there is only so much that you can do bit bashing
data with GPIO's.  The sdCard interface seems to be the only non-usb
filesystem I/O available. And that in turn limits the viability of
relieving the RAM constraints with virtual memory. So the ARM processor
itself is not usually the problem for me.

In general I find the pi a nice little device for quite a few things - like
low power, low bandwidth, low cost servers or displays with plenty of open
source compatibility. Or hacking/prototyping where I don't want to have to
worry too much about blowing things up. But it is not good for high throughput
I/O,  memory intensive applications, or anything requiring a lot of
processing power.

The validity of your conclusion regarding low power ARM in general probably
depends on what the other board you tried was..

DigbyT

On Wed, 10 Oct 2018 at 17:51, hiro <23h...@gmail.com> wrote:

> > Eliminating as much of the copy in/out WRT the kernel cannot but
> > help, especially when you're doing SDR decoding near the radios
> > using low-powered compute hardware (think Pies and the like).
>
> Does this include demodulation on the pi? cause even when i dumped the
> pi i was given for that purpose (with a <2Mbit I/Q stream) and
> replaced it with some similar ARM platform that at least had neon cpu
> instruction extensions for faster floating point operations, I was
> barely able to run a small FFT.
>
> My conclusion was that these low-powered ARM systems are just good
> enough for gathering low-bandwidth, non-critical USB traffic, like
> those raw I/Q samples from a dongle, but unfit for anything else.
>
>


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Giacomo Tesio
On Tue, 9 Oct 2018 at 05:33, Lucio De Re wrote:
>
> On 10/9/18, Bakul Shah  wrote:
> >
> > One thing I have mused about is recasting plan9 as a
> > microkernel and pushing out a lot of its kernel code into user
> > mode code.  It is already half way there -- it is basically a
> > mux for 9p calls, low level device drivers,
> >
> There are religious reasons not to go there

Indeed, as a heretic, one of the first things I did with Jehanne was
to move the console filesystem out of the kernel.
Then I moved several syscalls into userspace, or turned them into files
or into operations on existing files.
More syscalls/kernel services will move to user space as I find time
to hack on it again.

You know... heretics ruin everything!

I'm not going to turn Jehanne into a microkernel, but I'm looking for
the simplest possible set of kernel abstractions that can support a
distributed operating system able to replace the mainstream Web+OS
mess.
You know... heretics are crazy, too!


Giacomo



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread erik quanstrom
zero copy is also the source of the dreaded 'D' state.

- Erik

Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread Bakul Shah
On Oct 9, 2018, at 3:06 PM, erik quanstrom  wrote:
> 
> with meltdown/Spectre mitigations in place, I would like to see evidence that 
> flip is faster than copy.

If your system is well balanced, you should be able to
stream data as fast as memory allows[1]. In such a system
copying things N times will reduce throughput by a similar
factor. It may be that plan9 underperforms so much this
doesn't matter normally.

But the reason I want this is to reduce latency to the first
access, especially for very large files. With read() I have
to wait until the read completes. With mmap() processing can
start much earlier and can be interleaved with background
data fetch or prefetch. With read() a lot more resources
are tied down. If I need random access and don't need to
read all of the data, the application has to do pread(),
pwrite() a lot thus complicating it. With mmap() I can just
map in the whole file and excess reading (beyond what the
app needs) will not be a large fraction.
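
The contrast between the two access styles can be sketched as follows. This is
only an illustration in Python on a POSIX system (Plan 9 itself has no mmap
call); the helper names read_records_pread and read_records_mmap, the scratch
file, and the offsets are all made up for the example. Note that slicing a
Python mmap object still copies the bytes out, so this shows the programming
model rather than any performance gain.

```python
import mmap
import os
import tempfile


def read_records_pread(path, offsets, size):
    """Random access with explicit positioned reads: one syscall per record."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return [os.pread(fd, size, off) for off in offsets]
    finally:
        os.close(fd)


def read_records_mmap(path, offsets, size):
    """Random access through a mapping: the file is indexed like memory."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return [m[off:off + size] for off in offsets]


# Hypothetical scratch file: 16 KiB holding the byte pattern 0..255 repeated.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 64)
    path = tmp.name

offsets = [0, 4096, 300]
via_pread = read_records_pread(path, offsets, 4)
via_mmap = read_records_mmap(path, offsets, 4)
os.unlink(path)

print(via_pread == via_mmap)
```

With the mapping, pages are only faulted in when touched, which is the
"processing can start much earlier" property described above; the pread
version has to name every offset and length explicitly.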

The default assumption here seems to be that doing this
will be very complicated and be as bad as on Linux. But
Linux is not a good model of what to do and examples of what
not to do are not useful guides in system design. There are
other OSes such as the old Apollo Aegis (AKA Apollo/Domain),
KeyKOS & seL4 that avoid copying[2].

Though none of this matters right now as we don't even have
a paper design so please put down your clubs and swords :-)

[1] See: https://code.kx.com/q/cloud/aws/benchmarking/
A single q process can ingest data at 1.9GB/s from a
single drive. 16 can achieve 2.7GB/s, with theoretical
max being 2.8GB/s.

[2] Liedtke's original L4 evolved into a provably secure
seL4 and in the process it became very much like KeyKOS.
Capability systems do pass around pages as protected
objects and avoid copying. Sort of like how in a program
you'd pass a huge array by reference and not by value
to a function.





Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-10 Thread erik quanstrom
I have been able to copy 1 GiB/s to userspace from an nvme device.  I should think a radio should be no problem.

- Erik

Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread hiro
> via USB and see how it stands up.  But the real question is what
> kind of delay, latency, and jitter will there be, getting that raw
> I/Q data from the USB interface up to the consuming application?

How is your proposal of zero-copy going to help latency? IIRC we have
some real-time thingy, might be able to reduce jitter...
But then I might also ask why you're not doing the most critical path
on an fpga anyway?
Start with identifying your worst bottleneck.

> Eliminating as much of the copy in/out WRT the kernel cannot but
> help

wrong, this design change requires resources, too, and might gain you
higher complexity. measure first.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread hiro
> Eliminating as much of the copy in/out WRT the kernel cannot but
> help, especially when you're doing SDR decoding near the radios
> using low-powered compute hardware (think Pies and the like).

Does this include demodulation on the pi? cause even when i dumped the
pi i was given for that purpose (with a <2Mbit I/Q stream) and
replaced it with some similar ARM platform that at least had neon cpu
instruction extensions for faster floating point operations, I was
barely able to run a small FFT.

My conclusion was that these low-powered ARM systems are just good
enough for gathering low-bandwidth, non-critical USB traffic, like
those raw I/Q samples from a dongle, but unfit for anything else.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread hiro
I was responding to lyndon's comment on certain "experiments" that
should have to be done here, 2 messages up.
But what he described sounded exactly like the zero-copying stuff that
linux is trying to shove into everything.
I have not made any statement about non-linux systems, and I'm not
even saying these experiments couldn't be done on plan9, it's just
that the linux people are way busier going down that path.

On 10/10/18, Dan Cross  wrote:
> On Tue, Oct 9, 2018 at 7:24 PM hiro <23h...@gmail.com> wrote:
>
>> from what i see in linux people have been more than just exploring it,
>> they've gone absolutely nuts. it makes everything complex, not just
>> the fast path.
>>
>
> To whom are you responding? Your email is devoid of context, so it is not
> clear.
>
> However your statement appears to be based on an unstated assumption that
> there is a plan9 school of thought, and a Linux school of thought, and no
> other school of thought. If so, that is incorrect.
>
> - Dan C.
>



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Dan Cross
On Tue, Oct 9, 2018 at 7:24 PM hiro <23h...@gmail.com> wrote:

> from what i see in linux people have been more than just exploring it,
> they've gone absolutely nuts. it makes everything complex, not just
> the fast path.
>

To whom are you responding? Your email is devoid of context, so it is not
clear.

However your statement appears to be based on an unstated assumption that
there is a plan9 school of thought, and a Linux school of thought, and no
other school of thought. If so, that is incorrect.

- Dan C.


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Lyndon Nerenberg
cinap_len...@felloff.net writes:

> why? the *HOST CONTROLLER* schedules the data transfers.

I *DON'T KNOW*.  It's just observed behaviour.

> ah! we're talking about some crappy raspi here... probably with all
> caches disabled... never mind.

Hah.  An Rpi tips over with 1200 baud USB serial.  I was talking
about "real" (Intel :-P) hardware for the other tippy-over behaviour.

--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread cinap_lenrek
> To address Hiro's comments, I have no benchmarks on Plan 9, because
> the SDR code I run does not exist there.  But I do have experience
> with running SDR on Linux and FreeBSD with hardware like the HackRF
> One.  That hardware can easily saturate a USB2 interface/driver on
> both of those operating systems.  Given my experience with USB on
> Plan 9 to date, it's a safe bet that all the variants would die
> when presented with that amount of traffic. 

why? the *HOST CONTROLLER* schedules the data transfers. if the
program doesn't do a read() there's nothing to schedule... (unless
it's an isochronous endpoint, in which case the controller dma's for
you in the background at the specified sampling rate).

> (I can knock down a Plan9 system with 56 Kb/s USB serial traffic.)

that sounds seriously screwed up. i have no issues here reading a usb
stick on my x230 with xhci at 32MB/s, not using any fancy streaming
optimization. no load at all. and this is just some garbage from the
supermarket.

> I can see about
> twisting up some code that would read the raw I/Q data from the SDR
> via USB and see how it stands up.  But the real question is what
> kind of delay, latency, and jitter will there be, getting that raw
> I/Q data from the USB interface up to the consuming application?

is this an isochronous endpoint? in that case you would not have to
worry much as the controller does all the timing for you in hardware.

> Eliminating as much of the copy in/out WRT the kernel cannot but
> help, especially when you're doing SDR decoding near the radios
> using low-powered compute hardware (think Pies and the like).

ah! we're talking about some crappy raspi here... probably with all
caches disabled... never mind.

> --lyndon

--
cinap



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Lyndon Nerenberg
cinap_len...@felloff.net writes:
> > The big one is USB.  disk/radio->kernel->user-space-usbd->kernel->application.
> > Four copies.
>
> that sounds wrong.
>
> usbd is not involved in the data transfer.

You're right, I was wrong about 'usbd'.  In the bits of testing
I've done with this, 'usbd' is replaced with a user space file
server that abstracts the hardware and presents a useful file system
interface.  (E.g. along the lines of the gps filesystem interface.)

To address Hiro's comments, I have no benchmarks on Plan 9, because
the SDR code I run does not exist there.  But I do have experience
with running SDR on Linux and FreeBSD with hardware like the HackRF
One.  That hardware can easily saturate a USB2 interface/driver on
both of those operating systems.  Given my experience with USB on
Plan 9 to date, it's a safe bet that all the variants would die
when presented with that amount of traffic. (I can knock down a
Plan9 system with 56 Kb/s USB serial traffic.)  I can see about
twisting up some code that would read the raw I/Q data from the SDR
via USB and see how it stands up.  But the real question is what
kind of delay, latency, and jitter will there be, getting that raw
I/Q data from the USB interface up to the consuming application?

Eliminating as much of the copy in/out WRT the kernel cannot but
help, especially when you're doing SDR decoding near the radios
using low-powered compute hardware (think Pies and the like).

--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Digby R.S. Tarvin
On Tue, 9 Oct 2018 at 23:00, Ethan Gardener  wrote:

>
> Fascinating thread, but I think you're off by a decade with the 16-bit
> address bus comment, unless you're not actually talking about Plan 9.  The
> 8086 and 8088 were introduced with 20-bit addressing in 1978 and 1979
> respectively.  The IBM PC, launched in 1982, had its ROM at the top of that
> 1MByte space, so it couldn't have been constrained in that way.  By the end
> of the 80s, all my schoolmates had 68k-powered computers from Commodore and
> Atari, showing hardware with a 24-bit address space was very much
> affordable and ubiquitous at the time Plan 9 development started.  Almost
> all of them had 512KB at the time.  A few flashy gits had 1MB machines. :)
>

Not sure I would agree with that. The 20 bit addressing of the 8086 and
8088 did not change their 16 bit nature. They still had a 16 bit program
counter, with segmentation to provide access to a larger memory - similar
in principle to the PDP11 with MMU.
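
For reference, the 8086/8088 real-mode address calculation referred to here is
simple enough to write down. A minimal sketch in Python; the function name
physical_address is made up for the example:

```python
def physical_address(segment, offset):
    """8086 real mode: 16-bit segment shifted left 4 bits plus 16-bit offset.

    The result is truncated to 20 bits, matching the 8086's 20 address lines.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return ((segment << 4) + offset) & 0xFFFFF


# The reset vector FFFF:0000 sits 16 bytes below the top of the 1MByte space.
print(hex(physical_address(0xFFFF, 0x0000)))  # 0xffff0
# Any single segment still only spans 64K of offsets.
print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
```

Each pointer within a segment is still a 16 bit quantity; only the segment
register extends the reach to 1MByte.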

The first 32 bit x86 processor was the 386, which I think came out in 1985,
very close to when work on Plan9 was rumored to have started. So it seemed
not impossible that work might have started on an older 16 bit machine,
but at Bell Labs probably a long shot.


> I still wish I'd kept the better of the Atari STs which made their way
> down to me -- a "1040 STE" -- 1MB with a better keyboard and ROM than the
> earlier "STFM" models.  I remember wanting to try to run Plan 9 on it.
> Let's estimate how tight it would be...
>
> I think it would be terrible, because I got frustrated enough trying to
> run a 4e CPU server with graphics on a 2GB x86.  I kept running out of
> image memory!  The trouble was the draw device in 4th edition stores images
> in the same "image memory" the kernel loads programs into, and the 386 CPU
> kernel 'only' allocates 64MB of that. :)
>
> 1 bit per pixel would obviously improve matters by a factor of 16 compared
> to my setup, and 640x400 (Atari ST high resolution) would be another 5
> times smaller than my screen.  Putting these numbers together with my
> experience, you'd have to be careful to use images sparingly on a machine
> with 800KB free RAM after the kernel is loaded.  That's better than I
> thought, probably achievable on that Atari I had, but it couldn't be used
> as intensively as I used Plan 9 back then.
>
> How could it be used?  I think it would be a good idea to push the draw
> device back to user space and make very sure to have it check for failing
> malloc!  I certainly wouldn't want a terminal with a filesystem and
> graphics all on a single 1MByte 68000-powered computer, because a
> filesystem on a terminal runs in user space, and thus requires some free
> memory to run the programs to shut it down.  Actually, Plan 9's separation
> of terminal from filesystem seems quite the obvious choice when I look at
> it like this. :)
>

I went Commodore Amiga at about that time - because it at least supported
some form of multi-tasking out of the box, and I spent many happy hours
getting OS9 running on it.. An interesting architecture, capable of some
impressive graphics, but subject to quite severe limitations which made
general purpose graphics difficult. (Commodore later released SVR4 Unix for
the A3000, but limited X11 to monochrome when using the inbuilt graphics).

But being 32 bit didn't give it a huge advantage over the 16 bit x86
systems for tinkering with operating system, because the 68000 had no MMU.
It was easier to get a Unix like system going with 16 bit segmentation than
a 32 bit linear space and no hardware support for run time relocation.
(OS9 used position independent code throughout to work without an MMU, but
didn't try to implement fork() semantics).

It wasn't till the 68030 based Amiga 3000 came out in 1990 that it really
did everything I wanted. The 68020 with an optional MMU was equivalent, but
not so common in consumer machines.

Hardware progress seems to have been rather uninteresting since then. Sure,
hardware is *much* faster and *much* bigger, but fundamentally the same
architecture. Intel had a brief flirtation with a novel architecture with
the iAPX 432 in 81, but obviously found it was more profitable to make the
familiar architecture bigger and faster.


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread hiro
he has ignored my questions about measurement, so i'm sure he hasn't

On 10/9/18, cinap_len...@felloff.net  wrote:
> also, i wonder how much is the actual copy overhead you claim is the issue.
> maybe the impact for copying is more dominated by the memory allocator used
> for allocb(). have you measured?
>
> --
> cinap
>
>



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread cinap_lenrek
also, i wonder how much is the actual copy overhead you claim is the issue.
maybe the impact for copying is more dominated by the memory allocator used
for allocb(). have you measured?

--
cinap



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread cinap_lenrek
> The big one is USB.  disk/radio->kernel->user-space-usbd->kernel->application.
> Four copies.

that sounds wrong.

usbd is not involved in the data transfer. it mainly is just responsible for
enumerating devices and instantiating drivers and registering the endpoints
in devusb. after that you access the endpoint files from devusb which goes
directly to the kernel. devusb also allows you to create an alias for an
endpoint file which then appears directly under /dev. usb audio uses this
mechanism. the usb driver just activates the device and provides the ctl/volume
files, while audio data is handled by the kernel's devusb.

on another remark regarding zero copy. the reason plan9 drivers are small comes
from NOT doing these "optimizations". identity mapping the low part of memory
in the kernel avoids a lot of trouble and allows you to get DMA capable memory
by just wrapping a pointer in PADDR(va). no page lists needed. no MMU tricks
needed in the drivers. you can use any kernel memory va for DMA... even your
kernel stack! it's never paged out. you can be sure it is not changed while the
device looks at it etc. do not underestimate the impact of this
"simplification".
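
The difference between the two approaches can be sketched roughly as follows.
This is a toy model in Python, not kernel code: the KZERO value is an
assumption (the base used by the 32-bit PC kernel, as far as I know), and
user_dma_segments plus the page table contents are invented for illustration.

```python
PAGE = 4096
KZERO = 0xF0000000  # assumed kernel virtual base for the 32-bit PC kernel


def paddr(va):
    """Identity-style mapping: kernel va to physical address is one subtraction."""
    if not KZERO <= va <= 0xFFFFFFFF:
        raise ValueError("not a kernel virtual address")
    return va - KZERO


def user_dma_segments(va, length, page_table):
    """What a zero-copy driver needs instead: a list of (physical, length)
    pieces, because consecutive user pages need not be physically adjacent."""
    segments = []
    while length > 0:
        vpn, off = divmod(va, PAGE)
        chunk = min(PAGE - off, length)
        segments.append((page_table[vpn] * PAGE + off, chunk))
        va += chunk
        length -= chunk
    return segments


# A kernel buffer is physically contiguous: one address, one length.
print(hex(paddr(0xF0100000)))  # 0x100000

# A user buffer crossing a page boundary becomes a scatter list.
toy_page_table = {2: 7, 3: 1}  # hypothetical vpn -> physical frame
print(user_dma_segments(0x2F00, 0x300, toy_page_table))
```

The second function is the "page list" the drivers get to avoid; a real
implementation would also have to pin those pages for the duration of the i/o.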

linux block layer is broken in that regard btw. it just hands user pages into
the drivers without making sure they do not change while the i/o is in flight,
which results in all kinds of false-negatives when you actually start verifying
your raid arrays as different snapshots in time got written out to the raid
members. they know about this and ignore it because benchmarks are more 
important.

--
cinap



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Lyndon Nerenberg
hiro writes:
> from what i see in linux people have been more than just exploring it,
> they've gone absolutely nuts. it makes everything complex, not just
> the fast path.

And those are the Linux folks doing their thing.  The reading I'm
doing right now is related to the pessimizations page flipping throws
at the CPU caches.  It looks scary ...


--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread hiro
also, if all you care about is throughput, i don't see how those 4
copies you identified make a difference. especially with something
slow like USB.
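
A back-of-the-envelope version of this argument, as a sketch only. The
480 Mbit/s figure is the USB 2.0 high-speed signalling rate (real payload is
lower), the 2 Mbit/s stream and the 1 GiB/s copy rate are numbers quoted
elsewhere in this thread (the latter measured on different hardware, not a
Pi), and the model ignores caches, allocation, and per-call overhead. The
function name copy_load is made up.

```python
def copy_load(stream_bits_per_second, copies, copy_bytes_per_second):
    """Fraction of the available copy rate used to move one stream `copies` times."""
    stream_bytes_per_second = stream_bits_per_second / 8
    return stream_bytes_per_second * copies / copy_bytes_per_second


COPY_RATE = 2**30  # 1 GiB/s, taken from another message in the thread

usb2_saturated = copy_load(480_000_000, 4, COPY_RATE)
small_iq_stream = copy_load(2_000_000, 4, COPY_RATE)

print(f"{usb2_saturated:.1%}")   # roughly 22%
print(f"{small_iq_stream:.2%}")  # roughly 0.09%
```

Under these assumptions the four copies are negligible for a small I/Q stream
and noticeable, though not dominant, for a saturated USB2 link, which is why
measuring on the actual hardware matters.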



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread hiro
from what i see in linux people have been more than just exploring it,
they've gone absolutely nuts. it makes everything complex, not just
the fast path.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Lyndon Nerenberg
Bakul Shah writes:

And funny you should mention this!

> Some of this process/memory management can be delegated to
> user code as well.

At $DAYJOB we would really like to have application process control
over the kernel scheduler, as this seems to be the only realistic
way to avoid the (kernel) resource starvation issues we run into.

Our back end servers don't go down often.  But when they do, it's for
reasons entirely out of our control.  Because those resource allocation
policies have been pushed into the kernel, and beyond our control.


--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Lyndon Nerenberg
hiro writes:

> > Dealing with the security issues isn't trivial

> what security issues?

Passing protocol buffer like objects around user space, that might
affect how the kernel talks to hardware.  E.g. IPsec offload into
hardware.  You don't want user-space messing with that sort of
context, but you want to tag it with the data buffer as it gets
passed up and down through the user/kernel gate.  Practical page
flipping needs a kernel-read-only context attached to the non-kernel
user data part of the page.  A quick solution is to pair pages, one
half of which the kernel owns, the other being the data payload.  But
that's just a start.  And that's all I'm saying: this might be an
approach to a better/faster I/O paradigm, but it needs interested
people to explore it ...


--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Lyndon Nerenberg
hiro writes:

> Huh? What exactly do you mean? Can you describe the scenario and the
> measurements you made?

The big one is USB.  disk/radio->kernel->user-space-usbd->kernel->application.
Four copies.

I would like to start playing with software defined radio on Plan
9, but that amount of data copying is going to put a lot of pressure
on the kernel to keep up.  UNIX/Linux suffers the same copy bloat,
and it's having trouble keeping up, too.

--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Ori Bernstein
On Tue, 9 Oct 2018 10:50:08 -0700
Bakul Shah  wrote:

> Exactly! No point in being scared by labels! I am really
> only talking about distilling plan9 further. At least as a
> thought experiment.
> 
> Isn’t it more fun to discuss this than all the “heavy
> negativity”? :-)

It's much better with patches.

-- 
Ori Bernstein 



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread hiro
> E.g. right now Plan 9 suffers from a *lot* of data copying between
> the kernel and processes, and between processes themselves.

Huh? What exactly do you mean? Can you describe the scenario and the
measurements you made?

> If we could eliminate most of that copying, things would get a lot faster.

Which things would get faster?

> Dealing with the security issues isn't trivial

what security issues?



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Bakul Shah
> On Oct 9, 2018, at 2:45 AM, Ethan Gardener  wrote:
> 
> One day, Uriel met a man who explained very 
> convincingly that the Plan 9 kernel is a microkernel.
> On another day, Uriel met a man who explained very 
> convincingly that the Plan 9 kernel is a macrokernel.
> Uriel was enlightened.

Exactly! No point in being scared by labels! I am really
only talking about distilling plan9 further. At least as a
thought experiment.

Isn’t it more fun to discuss this than all the “heavy
negativity”? :-)

Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Lyndon Nerenberg
Bakul Shah writes:

> One thing I have mused about is recasting plan9 as a
> microkernel and pushing out a lot of its kernel code into user
> mode code.  It is already half way there -- it is basically a
> mux for 9p calls, low level device drivers, VM support & some
> process related code.

Somewhat related to this ... after reading some papers on
TCP-in-user-space implementations, I've been thinking about how an
interface that supported fast/secure page flipping between the
kernel and process address space would change how we do things.

E.g. right now Plan 9 suffers from a *lot* of data copying between
the kernel and processes, and between processes themselves.  If we
could eliminate most of that copying, things would get a lot faster.
Dealing with the security issues isn't trivial, but the programmer
time going into eking out the last bit of I/O throughput of the
current scheme could be redirected.

If it works, this would reduce the kernel back to handling
process/memory management, and talking to the hardware.  Not a
micro-kernel, but just as good from a practical standpoint.

And no, this wouldn't get us to running on the 11/70.  But by taking
advantage of modern large virtual memory spaces by using page
flipping, we could cut down on physical memory usage in the kernel.


--lyndon



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread erik quanstrom
>  From what I recall, PDP11 hardware memory management was based on
> segmentation rather than paging (64K divided into 16 variable sized
> segments), and Unix did swapping rather than paging (a process is either
> completely in memory or completely on disk). It does relocation and

completely in memory /and running/. or swapped out.

- erik



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread erik quanstrom
> I think it would be terrible, because I got frustrated enough trying to run a 
> 4e CPU server with graphics on a 2GB x86.  I kept running out of image 
> memory!  The trouble was the draw device in 4th edition stores images in the 
> same "image memory" the kernel loads programs into, and the 386 CPU kernel 
> 'only' allocates 64MB of that. :)  

this was changed long ago.  image memory can now be much bigger.  i never had a
problem when a 4e terminal was my daily driver.

- erik



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Ethan Gardener
On Tue, Oct 9, 2018, at 4:08 AM, Digby R.S. Tarvin wrote:
> I thought there might have been a chance of an early attempt to target the 
> x86 because of its ubiquity and low cost - which could be useful for a 
> networked operating system. And those were 16 bit address constrained in the 
> early days. But it's probably not an architecture you would choose to work
> with if you had a choice.. 68K is what I would have gone for..

Fascinating thread, but I think you're off by a decade with the 16-bit address 
bus comment, unless you're not actually talking about Plan 9.  The 8086 and 
8088 were introduced with 20-bit addressing in 1978 and 1979 respectively.  The 
IBM PC, launched in 1982, had its ROM at the top of that 1MByte space, so it 
couldn't have been constrained in that way.  By the end of the 80s, all my 
schoolmates had 68k-powered computers from Commodore and Atari, showing 
hardware with a 24-bit address space was very much affordable and ubiquitous at 
the time Plan 9 development started.  Almost all of them had 512KB at the time. 
 A few flashy gits had 1MB machines. :)

I still wish I'd kept the better of the Atari STs which made their way down to 
me -- a "1040 STE" -- 1MB with a better keyboard and ROM than the earlier 
"STFM" models.  I remember wanting to try to run Plan 9 on it.  Let's estimate 
how tight it would be...

I think it would be terrible, because I got frustrated enough trying to run a 
4e CPU server with graphics on a 2GB x86.  I kept running out of image memory!  
The trouble was the draw device in 4th edition stores images in the same "image 
memory" the kernel loads programs into, and the 386 CPU kernel 'only' allocates 
64MB of that. :)  

1 bit per pixel would obviously improve matters by a factor of 16 compared to 
my setup, and 640x400 (Atari ST high resolution) would be another 5 times 
smaller than my screen.  Putting these numbers together with my experience, 
you'd have to be careful to use images sparingly on a machine with 800KB free 
RAM after the kernel is loaded.  That's better than I thought, probably 
achievable on that Atari I had, but it couldn't be used as intensively as I 
used Plan 9 back then.  
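
The estimate above can be checked with a little arithmetic. A sketch in
Python; the 1280x1000 at 16 bits per pixel "my setup" figure is an assumption
chosen to match the stated factors of 16 and 5, not something given in the
message, and framebuffer_bytes is a made-up helper.

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Bytes needed for one full-screen image at the given depth."""
    return width * height * bits_per_pixel // 8


atari_mono = framebuffer_bytes(640, 400, 1)        # Atari ST high resolution
assumed_setup = framebuffer_bytes(1280, 1000, 16)  # hypothetical x86 screen
free_ram = 800 * 1024                              # 800KB left after the kernel

print(atari_mono)                   # 32000
print(assumed_setup // atari_mono)  # 80, i.e. 16 x 5
print(free_ram // atari_mono)       # 25 full-screen 1-bit images
```

So one full-screen 1-bit image costs about 31KB, and 800KB holds roughly 25
of them before counting programs, fonts, or anything else.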

How could it be used?  I think it would be a good idea to push the draw device 
back to user space and make very sure to have it check for failing malloc!  I 
certainly wouldn't want a terminal with a filesystem and graphics all on a 
single 1MByte 68000-powered computer, because a filesystem on a terminal runs
in user space, and thus requires some free memory to run the programs to shut 
it down.  Actually, Plan 9's separation of terminal from filesystem seems quite 
the obvious choice when I look at it like this. :)  



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread Ethan Gardener
On Tue, Oct 9, 2018, at 4:28 AM, Lucio De Re wrote:
> On 10/9/18, Bakul Shah  wrote:
> > One thing I have mused about is recasting plan9 as a
> > microkernel and pushing out a lot of its kernel code into user
> > mode code.  
> >
> There are religious reasons not to go there 

I'm trying to forget all the religious beliefs I once held with regard to 
computers, but I've had these lines in my head for a long time, and probably 
won't get a better opportunity to post them:

One day, Uriel met a man who explained very 
convincingly that the Plan 9 kernel is a microkernel.
On another day, Uriel met a man who explained very 
convincingly that the Plan 9 kernel is a macrokernel.
Uriel was enlightened.

Based on a true story. ;)


> You won't believe what kind of madnesses I need to deal with to
> consume my few and short remaining years - I'm with Dan in cursing the
> modern technological trends, but one of these days I'm going to lock
> myself in someone's attic or basement (or a prison cell, if that's
> what it takes, a monastery, whatever...) with my Galaxy S4 and a dated
> Riff-box - is that really what this black object is called? - and
> build an OS from the accumulated wisdom of the last forty years. It
> will probably look more like MS-DOS, though! :-(

I've started already, but I keep getting sidetracked by my need for 
entertainment, which often comes down to spending my energies on things which 
don't require such deep design work.  I'm hoping it'll get easier as my health 
improves; I'm still too stressed too often.  The trouble with this stress is I 
forget my goals, which are things I've learned from Plan 9 and other 
conclusions I've come to.  

-- 
Progress might have been all right once, but it has gone on too long -- Ogden 
Nash



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-09 Thread hiro
we already have a lot of user filesystems. feel free to add other useful ones.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread Lucio De Re
On 10/9/18, Bakul Shah  wrote:
>
> One thing I have mused about is recasting plan9 as a
> microkernel and pushing out a lot of its kernel code into user
> mode code.  It is already half way there -- it is basically a
> mux for 9p calls, low level device drivers, VM support & some
> process related code.  Such a redesign can be made more secure
> and more resilient.  The kind of problems you mention are
> easier to fix in user code. Different application domains may
> have different needs which are better handled as optional user
> mode components.
>
There are religious reasons not to go there and, perhaps not very
widely advertised, Minix-3 already does that, although I confess that
all my best efforts have not yet created the space for my own
experimentation with it.

You won't believe what kind of madnesses I need to deal with to
consume my few and short remaining years - I'm with Dan in cursing the
modern technological trends, but one of these days I'm going to lock
myself in someone's attic or basement (or a prison cell, if that's
what it takes, a monastery, whatever...) with my Galaxy S4 and a dated
Riff-box - is that really what this black object is called? - and
build an OS from the accumulated wisdom of the last forty years. It
will probably look more like MS-DOS, though! :-(

> Said another way, keep the good parts of the plan9 design and
> rearchitect/reimplement the kernel + essential drivers/usermode
> daemons.  This is unlikely to happen (without some serious
> funding) but still fun to think about!  If done, this would be
> a more radical departure than Oberon-7 compared to Oberon but
> in the same spirit.
>
Surely, the targets for experimentation should be the ubiquitous
smart-mobile and the insane arithmetic power of GPUs? All neatly
networked over SDLC (or HDLC: AoH, anyone, for persistent storage?).

Lucio.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread Digby R.S. Tarvin
On Tue, 9 Oct 2018 at 10:07, Dan Cross  wrote:

> My guess is that there is no reason in principle that it could not fit
>> comfortably into the constraints of a PDP11/70, but if the initial
>> implementation was done targeting a machine with significantly more
>> resources, it would be easy to make design decisions that would be entirely
>> incompatible.
>>
>
> I find this unlikely.
>
> The PDP-11, while a respectable machine for its day, required too many
> tradeoffs to make it attractive as a development platform for a
> next-generation research operating system in the late 1980s: be it
> electrical power consumption vs computational oomph or dollar cost vs
> available memory, the -11 had fallen from the attractive position it held a
> decade prior. Perhaps slimming a plan9 kernel down sufficiently so that it
> COULD run on a PDP-11 was possible in the early days, but I can't see any
> reason one would have WANTED to do so: particularly as part of the impetus
> behind plan9 was to exploit advances in contemporary hardware: lower-cost,
> higher-performance, RISC-based multiprocessors; ubiquitous networking;
> common high-resolution bitmapped graphical displays; even magneto-optical
> storage (one bet that didn't pan out); etc.
>

If you mean that you find it unlikely that development would have
been done on a PDP11, then I agree, for the reasons you mentioned.

Not sure that I can see why it wouldn't have been feasible, but I can see
why it wouldn't have been desirable.

I thought there might have been a chance of an early attempt to target the
x86 because of its ubiquity and low cost, which could be useful for a
networked operating system. And those were 16-bit address constrained in
the early days. But it's probably not an architecture you would choose to
work with if you had a choice. 68K is what I would have gone for.


> Certainly Richard Millar's comment suggests that might be the case. If it
>> is heavily dependent on VM, then the necessary rewrite is likely to be
>> substantial.
>>
>
> As a demonstration project, getting a slimmed-down plan9 kernel to boot on
> a PDP-11/70-class machine would be a nifty hack, but it would be quite a
> tour de force and most likely the result would not be generally useful. I
> think that, as has been suggested, the conceptual simplicity of plan9
> paradoxically means that resource utilization is higher than it might
> otherwise be on either a more elaborate OR more constrained system (such as
> one targeting e.g. the PDP-11). When you can afford not to care about a few
> bytes here or a couple of cycles there and you're not obsessed with
> scraping out the very last drop of performance, you can employ a simpler
> (some might say 'naive') algorithm or data structure.
>
> I'm not sure how the kernel design has changed since the first release.
>> The earliest version I have is the release I bought through Harcourt Brace
>> back in 1995. But I won't be home till December so it will be a while
>> before I can look at it, and probably won't have time to experiment before
>> then in any case.
>>
>
> The kernel evolved substantially over its life; something like doubling in
> size. I remember vaguely having a discussion with Sape where he said he
> felt it had grown bloated. That was probably close to 20 years ago now.
>

I guess kernel size wasn't a priority. I did a bit of searching back
through the old papers, and whilst there is a lot of talk about lines of
code and numbers of system calls, I didn't find any reference to kernel
size or memory requirements.


> For what it is worth, I don't think the embarrassment of riches presented
>> to programmers by current hardware has tended to produce more elegant
>> designs. If more resources resulted in elegance, Windows would be a thing
>> of beauty.  Perhaps Plan9 is an exception. It certainly excels in elegance
>> and design simplicity, even if it does turn out to be more resource hungry
>> than I imagined. I will admit that the evils of excessively constrained
>> environments are generally worse in terms of coding elegance - especially
>> when it leads to overlays and self modifying code.
>>
>
> plan9 is breathtakingly elegant, but this is in no small part because as a
> research system it had the luxury of simply ignoring many thorny problems
> that would have marred that beauty but that the developers chose not to
> tackle. Some of these problems have non-trivial domain complexity and,
> while "modern" systems are far too complex by far, that doesn't mean that
> all solutions can be recast as elegantly simple pearls in the plan9 style.
> Whether we like those problems or not, they exist and real-world solutions
> have to at least attempt to deal with them (I'm looking at you, web x.0 for
> x >= 2...but curse you you aren't alone).
>
> PDP11's don't support virtual memory, so there doesn't seem any elegant
>> way to overcome that fundamental limitation on size of a single executable.
>>
>
> No, they do: there is 

Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread Christopher Nielsen
On Mon, Oct 8, 2018, 17:15 Bakul Shah  wrote:

> On Mon, 08 Oct 2018 19:03:49 -0400 Dan Cross  wrote:
> >
> > plan9 is breathtakingly elegant, but this is in no small part because as
> a
> > research system it had the luxury of simply ignoring many thorny problems
> > that would have marred that beauty but that the developers chose not to
> > tackle. Some of these problems have non-trivial domain complexity and,
> > while "modern" systems are far too complex by far, that doesn't mean that
> > all solutions can be recast as elegantly simple pearls in the plan9
> style.
>
> One thing I have mused about is recasting plan9 as a
> microkernel and pushing out a lot of its kernel code into user
> mode code.  It is already half way there -- it is basically a
> mux for 9p calls, low level device drivers, VM support & some
> process related code.  Such a redesign can be made more secure
> and more resilient.  The kind of problems you mention are
> easier to fix in user code. Different application domains may
> have different needs which are better handled as optional user
> mode components.
>
> Said another way, keep the good parts of the plan9 design and
> rearchitect/reimplement the kernel + essential drivers/usermode
> daemons.  This is unlikely to happen (without some serious
> funding) but still fun to think about!  If done, this would be
> a more radical departure than Oberon-7 compared to Oberon but
> in the same spirit.
>

I've mused about that also. My problem has been finding the time. I think
it would be a worthwhile project.

Not entirely unrelated, I've been tinkering with seL4.

>


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread Bakul Shah
On Mon, 08 Oct 2018 19:03:49 -0400 Dan Cross  wrote:
>
> plan9 is breathtakingly elegant, but this is in no small part because as a
> research system it had the luxury of simply ignoring many thorny problems
> that would have marred that beauty but that the developers chose not to
> tackle. Some of these problems have non-trivial domain complexity and,
> while "modern" systems are far too complex by far, that doesn't mean that
> all solutions can be recast as elegantly simple pearls in the plan9 style.

One thing I have mused about is recasting plan9 as a
microkernel and pushing out a lot of its kernel code into user
mode code.  It is already half way there -- it is basically a
mux for 9p calls, low level device drivers, VM support & some
process related code.  Such a redesign can be made more secure
and more resilient.  The kind of problems you mention are
easier to fix in user code. Different application domains may
have different needs which are better handled as optional user
mode components.

Said another way, keep the good parts of the plan9 design and
rearchitect/reimplement the kernel + essential drivers/usermode
daemons.  This is unlikely to happen (without some serious
funding) but still fun to think about!  If done, this would be
a more radical departure than Oberon-7 compared to Oberon but
in the same spirit.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread Dan Cross
On Mon, Oct 8, 2018 at 6:25 PM Digby R.S. Tarvin  wrote:

> Does anyone know what platform Plan9 was initially implemented on?
>

My understanding is that the earliest experiments involved a VAX, but
development quickly shifted to MIPS and 68020-based machines (the "gnot"
was, IIRC, a 68020-based computer).

My guess is that there is no reason in principle that it could not fit
> comfortably into the constraints of a PDP11/70, but if the initial
> implementation was done targeting a machine with significantly more
> resources, it would be easy to make design decisions that would be entirely
> incompatible.
>

I find this unlikely.

The PDP-11, while a respectable machine for its day, required too many
tradeoffs to make it attractive as a development platform for a
next-generation research operating system in the late 1980s: be it
electrical power consumption vs computational oomph or dollar cost vs
available memory, the -11 had fallen from the attractive position it held a
decade prior. Perhaps slimming a plan9 kernel down sufficiently so that it
COULD run on a PDP-11 was possible in the early days, but I can't see any
reason one would have WANTED to do so: particularly as part of the impetus
behind plan9 was to exploit advances in contemporary hardware: lower-cost,
higher-performance, RISC-based multiprocessors; ubiquitous networking;
common high-resolution bitmapped graphical displays; even magneto-optical
storage (one bet that didn't pan out); etc.

Certainly Richard Millar's comment suggests that might be the case. If it
> is heavily dependent on VM, then the necessary rewrite is likely to be
> substantial.
>

As a demonstration project, getting a slimmed-down plan9 kernel to boot on
a PDP-11/70-class machine would be a nifty hack, but it would be quite a
tour de force and most likely the result would not be generally useful. I
think that, as has been suggested, the conceptual simplicity of plan9
paradoxically means that resource utilization is higher than it might
otherwise be on either a more elaborate OR more constrained system (such as
one targeting e.g. the PDP-11). When you can afford not to care about a few
bytes here or a couple of cycles there and you're not obsessed with
scraping out the very last drop of performance, you can employ a simpler
(some might say 'naive') algorithm or data structure.

I'm not sure how the kernel design has changed since the first release. The
> earliest version I have is the release I bought through Harcourt Brace back
> in 1995. But I won't be home till December so it will be a while before I
> can look at it, and probably won't have time to experiment before then in
> any case.
>

The kernel evolved substantially over its life; something like doubling in
size. I remember vaguely having a discussion with Sape where he said he
felt it had grown bloated. That was probably close to 20 years ago now.

For what it is worth, I don't think the embarrassment of riches presented
> to programmers by current hardware has tended to produce more elegant
> designs. If more resources resulted in elegance, Windows would be a thing
> of beauty.  Perhaps Plan9 is an exception. It certainly excels in elegance
> and design simplicity, even if it does turn out to be more resource hungry
> than I imagined. I will admit that the evils of excessively constrained
> environments are generally worse in terms of coding elegance - especially
> when it leads to overlays and self modifying code.
>

plan9 is breathtakingly elegant, but this is in no small part because as a
research system it had the luxury of simply ignoring many thorny problems
that would have marred that beauty but that the developers chose not to
tackle. Some of these problems have non-trivial domain complexity and,
while "modern" systems are far too complex by far, that doesn't mean that
all solutions can be recast as elegantly simple pearls in the plan9 style.
Whether we like those problems or not, they exist and real-world solutions
have to at least attempt to deal with them (I'm looking at you, web x.0 for
x >= 2...but curse you you aren't alone).

PDP11's don't support virtual memory, so there doesn't seem any elegant way
> to overcome that fundamental limitation on size of a single executable.
>

No, they do: there is paging hardware on the PDP-11 that's used for address
translation and memory protection (recall that PDP-11 kept the kernel at
the top of the address space, the per-process "user" structure is at a
fixed virtual address, and the system could trap a bus error and kill a
misbehaving user-space process). What they may not support is the sort of
trap handling that would let them recover from a page fault (though I
haven't looked) and in any case, the address space is too small to make
demand-paging with reclamation cost-effective.
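
As a rough illustration of the address translation described above, here is a
minimal sketch in Python. It is not taken from this thread or from any kernel
source. It assumes the usual PDP-11 memory-management layout (eight 8 KiB
pages per address space, each with a page address register that holds the
physical base in 64-byte units) and it leaves out page-length and
access-permission checks entirely. The names translate and par are
hypothetical.

```python
# Sketch of PDP-11 style address translation (illustrative, not an emulator).
# Assumption: 8 pages of 8 KiB each; every page address register (PAR)
# holds the physical base of its page in units of 64 bytes.
PAGE_BITS = 13    # low 13 bits of a virtual address are the in-page offset
BLOCK_SHIFT = 6   # PAR values count 64-byte blocks


def translate(vaddr, par):
    """Map a 16-bit virtual address to a physical address.

    par is a list of 8 page address register values (hypothetical name).
    Length and protection checks are deliberately omitted.
    """
    if not 0 <= vaddr <= 0xFFFF:
        raise ValueError("virtual address must fit in 16 bits")
    page = vaddr >> PAGE_BITS                  # top 3 bits select the page
    offset = vaddr & ((1 << PAGE_BITS) - 1)    # remaining 13 bits
    return (par[page] << BLOCK_SHIFT) + offset


# Example: map virtual page 1 to physical address 0o100000 (32768).
par = [0] * 8
par[1] = 0o1000                 # 512 blocks * 64 bytes = 32768
physical = translate(0o20004, par)
```

With a register wide enough to hold a 16-bit block number, the same scheme
reaches well beyond the 64K a process can name directly, which is how a
machine with a small virtual address space can still use a much larger
physical memory.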


> So I don't think it would be worth a substantial rewrite to get it
> going. It is a shame that there don't seem to have been any more powerful
> machines with a comparably 

Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread Digby R.S. Tarvin
Does anyone know what platform Plan9 was initially implemented on? My guess
is that there is no reason in principle that it could not fit comfortably
into the constraints of a PDP11/70, but if the initial implementation was
done targeting a machine with significantly more resources, it would be
easy to make design decisions that would be entirely incompatible.
Certainly Richard Millar's comment suggests that might be the case. If it
is heavily dependent on VM, then the necessary rewrite is likely to be
substantial.

I'm not sure how the kernel design has changed since the first release. The
earliest version I have is the release I bought through Harcourt Brace back
in 1995. But I won't be home till December so it will be a while before I
can look at it, and probably won't have time to experiment before then in
any case.

For what it is worth, I don't think the embarrassment of riches presented
to programmers by current hardware has tended to produce more elegant
designs. If more resources resulted in elegance, Windows would be a thing
of beauty.  Perhaps Plan9 is an exception. It certainly excels in elegance
and design simplicity, even if it does turn out to be more resource hungry
than I imagined. I will admit that the evils of excessively constrained
environments are generally worse in terms of coding elegance - especially
when it leads to overlays and self modifying code.

PDP11s don't support virtual memory, so there doesn't seem any elegant way
to overcome that fundamental limitation on size of a single executable.  So
I don't think it would be worth a substantial rewrite to get it going. It
is a shame that there don't seem to have been any more powerful machines
with a comparably elegant architecture and attractive front panel :)

It is sounding like Inferno is going to be the more practical option. I
believe gcc can still generate PDP-11 code, so it shouldn't be too hard to
try.

DigbyT

On Tue, 9 Oct 2018 at 04:53, hiro <23h...@gmail.com> wrote:

> i should have said could, not can :)
>
>


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread hiro
i should have said could, not can :)



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread Charles Forsyth
Ideally, anyway.

On Mon, 8 Oct 2018 at 11:20, hiro <23h...@gmail.com> wrote:

> saving every bit of memory has costs in coding, the pressure wasn't as
> strong any more.
> the earned flexibility can be used for more elegant design.
>
>


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread Digby R.S. Tarvin
I quite agree - the PDP 11/70 was quite a high end 16 bit machine, but it
was the machine that I was talking about and the one I would most like to
revisit (although I wouldn't turn down an 11/40 if somebody offered me a
working one). I don't think I would contemplate putting Plan9 on a machine
with no MMU or a 64K physical memory limit.

My first reasonable multi-user, multi-tasking computer system (back in the
early 80s) was a home-made 6809 machine with a 6829 MMU and eventually 1MB of
RAM, running OS-9/6809. It initially ran with 64K for programs, and the
rest of memory was a big RAM disk, because what else could you do with
such a ridiculous amount of memory? It did pretty well at providing a
personal Unix-like environment, although it couldn't reproduce the fork()
semantics, there was no memory protection, and the memory constraints
meant always running the C compiler one pass at a time. But we eventually
ported 'Level 2' OS-9, which could use a mapping RAM/MMU, and with that I
had a quite robust multi-user system, with up to 64K available per process
and 64K available for the kernel. I was able to get most Unix programs
running on it (except for a few with big tables that compiled to larger
than 64K) and no longer had to worry about exiting the editor before doing
a compile. Most of the core system utilities were written in assembly
language, so the equivalent of 'ls', for example, required no more than a
256 byte memory allocation. And all executables were loaded read-only and
re-entrant (shared text), which helped. The only real Achilles heel was that
the 6809 had no illegal instruction trapping, so executing data could
occasionally result in an unrecoverable freeze.

I never liked the 68K version of OS-9 quite as much. Because of the larger
address space it used the MMU for protection only, with no address
translation - so the kernel was mapped into the same address space as the
user programs but just not accessible in user mode. It just didn't seem as
elegant.

Anyway, that's why I don't see 64K per process as necessarily being
inadequate for a lean operating system, although it would be easy enough to
write extravagant code that would not run in 64K, or a design that relied
on a large virtual address space - especially if you were used to relying
on virtual memory. I just don't know how small Plan9 can go, and unless
someone has already explored those limits, I suppose rather than
speculating I'll just have to plan on a little experimentation when I get a
bit of spare time.

Regards,
Digby



On Mon, 8 Oct 2018 at 19:13, Nils M Holm  wrote:

> On 2018-10-08T15:29:02+1100, Digby R.S. Tarvin wrote:
> > A native Inferno port would certainly be a lot easier, but I think you
> > might be a bit pessimistic about what can fit into a 64K address space
> > machine. The 11/70 certainly managed to run a very respectable V7 Unix
> > supporting 20-30 simultaneous active users in its day, [...]
>
> The 11/70 was a completely different beast than, say, an 11/03.
> The 70 had a backplane with 22 address lines, a MMU, and up to
> 4M bytes of memory. So while its processes were limited to
> 64K+64K bytes, I would not consider it to be a typical 16-bit
> machine.
>
> --
> Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org
>
>


Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread Nils M Holm
On 2018-10-08T15:29:02+1100, Digby R.S. Tarvin wrote:
> A native Inferno port would certainly be a lot easier, but I think you
> might be a bit pessimistic about what can fit into a 64K address space
> machine. The 11/70 certainly managed to run a very respectable V7 Unix
> supporting 20-30 simultaneous active users in its day, [...]

The 11/70 was a completely different beast than, say, an 11/03.
The 70 had a backplane with 22 address lines, a MMU, and up to
4M bytes of memory. So while its processes were limited to
64K+64K bytes, I would not consider it to be a typical 16-bit
machine.
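
The figures above can be checked with a little arithmetic. This is only a
sketch: the variable names are made up for illustration, and it assumes the
64K+64K per-process limit refers to separate 16-bit instruction and data
spaces.

```python
# Back-of-envelope check of the 11/70 numbers quoted above.
ADDRESS_LINES = 22    # physical address lines on the backplane
VIRTUAL_BITS = 16     # width of a virtual address within one space

physical_bytes = 2 ** ADDRESS_LINES         # total addressable physical memory
per_space_bytes = 2 ** VIRTUAL_BITS         # one 16-bit address space
per_process_bytes = 2 * per_space_bytes     # separate instruction + data spaces

# How many maximum-size processes would fit in physical memory at once,
# ignoring the kernel and the I/O page.
max_resident = physical_bytes // per_process_bytes

print(physical_bytes, per_process_bytes, max_resident)
```

So 22 address lines give 4M bytes of physical memory, and a process using both
spaces tops out at 128K, a small fraction of what the machine as a whole can
hold.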

-- 
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread Nils M Holm
On 2018-10-08T05:38:07+0200, Lucio De Re wrote:
> You really must be thinking of Inferno, native, running in a host with
> 1MiB of memory. 64KiB isn't enough for anything other than maybe CPM.
> Even MPM won't cut it, I don't think.

There were several UNIX 6th Edition-based "Mini Unix" variants
for the PDP-11/03 and other 16-bit systems. Then there is UZI,
the Unix Z80 Implementation, which can run multiple processes
(with swapping) in 64K bytes of RAM. CP/M ran in much less than
64KB.

-- 
Nils M Holm  < n m h @ t 3 x . o r g >  www.t3x.org



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-08 Thread hiro
saving every bit of memory has costs in coding, the pressure wasn't as
strong any more.
the earned flexibility can be used for more elegant design.



Re: [9fans] PDP11 (Was: Re: what heavy negativity!)

2018-10-07 Thread Digby R.S. Tarvin
A native Inferno port would certainly be a lot easier, but I think you
might be a bit pessimistic about what can fit into a 64K address space
machine. The 11/70 certainly managed to run a very respectable V7 Unix
supporting 20-30 simultaneous active users in its day, and I wouldn't have
thought Plan 9, arriving about a decade later, would have been hugely
bigger than V7 Unix.
I recall a demo of Plan9 (I think it also included the source) being given
by Rob Pike at UNSW, which he carried on a 1.44MB floppy disc. By its open
source release in 2002 the distribution was 65MB.

The smallest Linux system I have used recently had 256K RAM and 512K flash.
A rather stripped down busybox based system, but it did include a full
TCP/IP stack and a web server. That's comparable to a PDP11 except for the
limitation on the largest individual process.

Bear in mind that 16 bit executables are smaller, and whilst the 11/70 had
a 64KB address space, physical memory could be somewhat larger, and an
individual process could have 128K of memory if using separate instruction
and data space.

I am used to thinking of Plan9 as very compact, but I haven't really looked
to see if it has grown much since the 80s, and perhaps it is only next to
the astronomical expansion of other systems that it still looks small. It
would be an interesting exercise to find out.

It would be an interesting thing to try, if only to get a better feel for
how compact Plan9 actually is ...

DigbyT

On Mon, 8 Oct 2018 at 14:38, Lucio De Re  wrote:

> On 10/8/18, Digby R.S. Tarvin  wrote:
> >
> > So the question is... is plan9 still lean and mean enough to fit onto a
> > machine with a 64K address space? Doing a port would certainly provide
> > plenty of opportunity to tinker with the lights and switches on front
> > panel, and if the port was initially limited to being a CPU server,
> > there would be no need to worry about displays and mass storage just
> > the compiler back end and low level kernel support.
> >
> You really must be thinking of Inferno, native, running in a host with
> 1MiB of memory. 64KiB isn't enough for anything other than maybe CPM.
> Even MPM won't cut it, I don't think.
>
> Lucio.
>