On Thursday 31 October 2002 03:03 pm, Greg Smith wrote:
> I understand that much.... but why did Intel want you to use a top-down
> stack ??  I remember from my Pascal days that you could reference your
> caller's local variables, so I guess it's easier to reference them in a
> top-down stack.  Just a guess, I have no idea.

From a guy whose first exposure to computers, at all, was learning to
hand-assemble for the hot new Intel 8080A chip...

I think the thing people are overlooking in the question of "why a downward
stack?" is that there's a radical difference in how things worked back then,
in the microprocessor world.

Microprocessors ran single-tasking operating systems.

Every application owned the whole machine.

Security? Why would I need that when there's only one terminal and one paper
tape punch? Just lock up your tape reels, if you actually care whether someone
looks at your program for tallying red, green, and blue widgets as they come
down their three chutes.

Mainframes have been multitasking for a lot longer than microcomputers. So I
think this crowd may be overlooking just how *primitive* things were in the
micro world when that stack design decision was made.

Back in the really early days of the Intel architecture, there was only 16-bit
addressing of memory. There were no segment registers, no memory protection
at all, just one flat memory space of 65536 bytes maximum. The use of an
external stack was a significant innovation of the 8080 over the 8008 that
preceded it.

So, where did it make sense to put the stack?

Well, the program counter (PC) was reset to 0x0000 on power up or restart.
Until the 8085 there was no specialized use of low memory as interrupt
vectors, and even then, the 0x0000 address was still the initial PC
value. So applications, which expected to own the machine, were typically
assembled with "ORG 0" as their stated or implied location. You started
at address zero because it was simple to debug that way, because you could
guarantee that there had to be ROM there (else the 8080 wouldn't have any
initialization code), and because there was no good reason to complicate
things by starting elsewhere.

If your application was loading into RAM instead of ROM, you still thought
of memory as growing "upward from zero" as far as your application's static
storage and code was concerned.

The stack was a weird duck. In an era when lots of programs statically
allocated their variables and didn't even really use subroutine parameters
except in registers, the stack might very well be the *only* dynamically-
growing memory structure in the entire system. Really. Ask anyone who did
embedded systems work back then. You avoided dynamic memory structures if
you could because the programming and debugging tools were primitive and
because often the applications were simple enough not to need them. And
because dynamic memory structures needed to be managed by the application
or at least the compiler runtime, and that took CPU cycles. With a clock
speed of around 2 megahertz, one clock state was 500 nanoseconds and a
typical instruction ran 4 to 18 states, so every instruction spent on
bookkeeping cost microseconds. This mattered a lot.

So, again we are back to, "Where do we put the stack?"

The best case from the coder's standpoint was that the stack, which was often
used only for saving register state and return addresses (really just more
register state, since a return address is a saved copy of the next PC value),
should be somewhere ... out of the way. Somewhere it could be initialized
at power up and simply forgotten.
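
(To make that concrete: here's a little C model of the stack traffic behind a
CALL/RET pair. The high-byte-first push order matches the 8080's convention,
but the memory array, names, and structure are my own toy sketch, not anything
out of an Intel manual.)

    #include <stdint.h>

    /* Toy model of the 8080's CALL and RET. mem, sp, and pc stand in
     * for the real hardware; only the stack behavior matters here. */
    static uint8_t  mem[65536];
    static uint16_t sp, pc;

    static void call_op(uint16_t target) {
        uint16_t ret_addr = pc;               /* address of next instruction */
        mem[--sp] = (uint8_t)(ret_addr >> 8); /* PUSH: predecrement, high byte */
        mem[--sp] = (uint8_t)(ret_addr);      /* then the low byte */
        pc = target;
    }

    static void ret_op(void) {
        uint16_t lo = mem[sp++];              /* POP: low byte, postincrement */
        uint16_t hi = mem[sp++];              /* then the high byte */
        pc = (uint16_t)((hi << 8) | lo);
    }

    int main(void) {
        sp = 0x4000;       /* one above the top of RAM in this example */
        pc = 0x0003;       /* pretend a CALL at 0x0000 just finished decoding */
        call_op(0x0100);   /* return address saved at 0x3FFE-0x3FFF */
        ret_op();          /* pc comes back as 0x0003 */
        return 0;
    }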

Since ROM had to start at 0x0000, it follows that RAM -- which, of course, is
where stacks have to live -- used the higher memory addresses. Often the
hardware designer would start RAM at either 0x8000 and go upward, leaving room
for ROM expansion, or at 0xffff going downward (same reasoning, different
way of expressing it). There were a couple of ways to make the RAM/ROM
divide easy in hardware. 0x8000 meant using A15 as a simple RAM/!ROM select.
Using 0x4000 meant using RAM=(A15+A14) and ROM=!(A15+A14). Using 0xc000 meant
RAM=(A15*A14) and ROM=!(A15*A14). If your ROM chip used only A11 through A0
(4K), so what? Just let it phantom through the whole ROM address space. If
you add a bigger one in the next design iteration, just route over the other
address pin (A12) and things sort themselves out automatically. Easy.
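
(Written out as C predicates instead of TTL gates, with names of my own
choosing, those decode options look like this; + is OR and * is AND in the
notation above.)

    #include <stdio.h>
    #include <stdint.h>

    /* The top two address lines of a 16-bit address. */
    static int a15(uint16_t a) { return (a >> 15) & 1; }
    static int a14(uint16_t a) { return (a >> 14) & 1; }

    /* RAM from 0x8000 up: A15 alone is the RAM/!ROM select. */
    static int ram_from_8000(uint16_t a) { return a15(a); }

    /* RAM from 0x4000 up: RAM = (A15 + A14), ROM = !(A15 + A14). */
    static int ram_from_4000(uint16_t a) { return a15(a) | a14(a); }

    /* RAM from 0xC000 up: RAM = (A15 * A14), ROM = !(A15 * A14). */
    static int ram_from_c000(uint16_t a) { return a15(a) & a14(a); }

    int main(void) {
        uint16_t a = 0x9000;
        printf("0x%04X selects %s\n", a, ram_from_8000(a) ? "RAM" : "ROM");
        return 0;
    }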

Again, remember that being able to tell a hardware designer, "If you do it
this way, you can eliminate one TTL logic package for address decode" ---
MATTERED! You probably had two spare NOR gates in another package somewhere,
so if you could decode on just the upper two address lines, A15 and A14, you
really could save that extra logic package.

Intel's designers no doubt went through a lot of these thought processes as
they planned the 8080, and also as they used the existing 8008 processors
in real-world situations.

So they came up with the downward-growing stack. Put the stack pointer
initially at one address above the top of memory, because PUSH and POP used
predecrement/postincrement logic. If you were lucky enough to have a full
load of 64K of memory, you could set the SP to 0x0000 so that its first
byte stored would wrap around to 0xffff, the top of RAM. In any case, init
the SP at power up, don't use the top few words of RAM, and basically forget
about the stack.
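
(The wraparound is nothing more than 16-bit unsigned arithmetic, as this
self-contained C fragment shows; the numbers are the point, the rest is my
scaffolding.)

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint16_t sp = 0x0000;  /* "one above the top" of a full 64K */
        sp--;                  /* the first predecrement of the first PUSH */
        printf("first stack byte lands at 0x%04X\n", (unsigned)sp); /* 0xFFFF */
        return 0;
    }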

Subroutine call nesting was typically shallow. Few if any variables were
allocated on the stack, if you were coding in assembler, because it was
too cumbersome to manage. Variables were often statically allocated. That
meant the stack in most applications stayed very small, just a few dozen
bytes in some cases. If you had 4K or 8K or 16K (wow!) of RAM, you stuck
the SP at (top_of_ram + 1), let it grow down into RAM, and forgot about
it altogether. You just knew in your code design not to use the top hundred
bytes or so of RAM for static variables.
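
(For instance, with a hypothetical 16K of RAM at 0x4000, figures of my own
invention, the whole memory plan fit in a few lines of definitions:)

    /* Hypothetical 16K system: RAM from 0x4000 through 0x7FFF. */
    #define RAM_BASE  0x4000u
    #define RAM_TOP   0x7FFFu
    #define INIT_SP   (RAM_TOP + 1u)   /* first PUSH lands at 0x7FFF */

    /* Statics grow up from RAM_BASE; the stack grows down from INIT_SP.
     * Keep the top hundred bytes or so free of statics, and never
     * think about the stack again. */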

This was playing pretty fast and loose with memory management, and I'm not
saying all programs were written that way. But when the 8080 first came out
in 1974, microprocessor tools were pretty primitive and assembler was still
the best way to wring out the last drops of precious performance. And
people weren't running complex multithreaded, multitasking systems on the
8080. You had control of the application and the hardware environment, and
so if you were a fairly careful designer, you actually could get away with
things like this that would be unthinkable in a modern system.

By the end of the 8080/8085 series' lifespan, programmers were using other
languages more. PL/M, BASIC, FORTRAN, and C were available later on, and
the reasons for the original downward-growing stack faded away as the
programmer was relieved of the need to manage it. But the reasons Intel
chose this approach are entirely sensible and understandable, at least to
me, given the mindset of the microprocessor programmers of the day.

Incidentally, I noted in another post a comment that there should be
hardware protection to keep the processor from executing instructions off
the stack. Great idea. Starting with the 8086, Intel *had* this protection,
because the stack had its own memory segment that could be physically
isolated from other segments. You had to explicitly set the code segment
register to point into the stack segment if you wanted to execute stack
data as code. The 8086 didn't have any security features to keep malicious
code from doing this, but at least it didn't happen by accident. Starting
with the 80286, there were actually security privilege levels that kept
application programs from tinkering with segment registers -- as long as
you were running in an operating system that supported this. DOS, of
course, didn't.
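
(A quick sketch of why: a real-mode physical address was formed as
segment * 16 + offset, so code fetches through CS:IP and stack accesses
through SS:SP could be aimed at completely disjoint regions of physical
memory. The register values below are arbitrary examples of mine.)

    #include <stdio.h>
    #include <stdint.h>

    /* 8086 real-mode address formation: physical = segment * 16 + offset. */
    static uint32_t phys(uint16_t seg, uint16_t off) {
        return ((uint32_t)seg << 4) + off;
    }

    int main(void) {
        uint16_t cs = 0x1000, ip = 0x0000;  /* code around 0x10000 */
        uint16_t ss = 0x9000, sp = 0xFFFE;  /* stack near 0x9FFFE */
        printf("next instruction fetch: 0x%05X\n", (unsigned)phys(cs, ip));
        printf("top of stack:           0x%05X\n", (unsigned)phys(ss, sp));
        /* Unless CS is deliberately loaded with the stack's segment,
         * an instruction fetch simply cannot land in the stack's memory. */
        return 0;
    }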

Then came our modern age, the age of flat memory models. Segment registers
are anachronistic. Toss them out. One simple, flat memory model is the only
way to go.

Hello simple memory management and portability. Bye-bye hardware stack
protection.

To be fair, I should mention that I generally *agree* with the flat memory
model. It really goes a long way toward improving portability, since there are
tons of hardware architectures that have no analogue to the Intel segment
registers. But in dumping the segment registers, we didn't get our lunch
totally for free. <GRIN>

    "The Intel segment registers are something of the trouble sort.
     The Sinclair crashes every time you implement a bubble sort.
     And those who bought a Cray will find they haven't spent their money well.
     And need I even mention Nixdorf, Unisys, and Honeywell?"

     -- Steve Levine, "I've Built a Better Model Than the One at Data General"
     http://www.poppyfields.net/filks/00007.html
     (and, oddly, the version published here has slightly different words
     than the one I know, but I have a tape of it performed by the author
     with the above words)

Kind regards,

Scott

--
-----------------------------------------------------------------------------
Scott D. Courtney, Senior Engineer                     Sine Nomine Associates
[EMAIL PROTECTED]                           http://www.sinenomine.net/
