Re: [plex86] [no subject]

Willow Schlanger Wed, 23 May 2001 10:07:07 -0700
Note: this is off-topic.

Know what would _really_ be cool? Running a neural network that analyses
the port I/O to the SVGA/chipset. It could devise a series of
instructions (you know, logic gates connected to each other) that would
mimic the hardware enough to boot up anyways. Of course it would always
encounter flaws (what if you do something that the neural network wasn't
trained for?) but IMHO it would be good enough, since you'd always
_know_ when you encountered something new, if you worked at it.

This would be _no_ good whatsoever for creating a program to emulate an
actual chipset (like an SVGA chipset) but it would be good enough to
reduce bandwidth.

Too bad I don't have an oscilloscope. I could feed a bunch of wires into
an oscilloscope and use that as input to the neural network. Then it
could actually devise software that duplicates the functionality of the
chips on the board. The NN would have ins and outs that go through
"gates" that emulate all the logic that is obvious, and it would figure
out the rest by guess & check: comparing against tables of I/Os that
have already been recorded, and comparing the output predictions it
makes to the actual outputs, past and future. Once the prediction rate
is 95%, you'd do this:

While the NN's connections were being made you'd keep track of things,
and then when it's done and scores well enough you'd write a program
that fed in all possible inputs to get all possible outputs, _except_
where the ins/outs appear to be linear. In those cases you'd verify
that they are in fact linear by analysing (somehow????) the neural
network's connections and then write that to a text file.

The result: you'd have a text file giving you all the documentation you
need to write a program that emulates all the hardware on an SVGA card.

Nifty idea, eh? Too bad I don't have an oscilloscope....

BTW I'm trying to make my program that walks through code smarter. It's
really pissing me off. If someone does this:

sub sp,2
PUSH (offset)
ret

the stupid thing doesn't know that the RET will cause eIP to go to
(offset) because the sub sp,2 confuses it.

Then I'm left with a dead end.
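
A minimal Python sketch of one way a walker could get past this, assuming
it models the stack as a list of 2-byte slots and treats reserved space
as "unknown". The instruction tuples and the name resolve_ret are made
up for illustration; this is not the actual walker.

```python
UNKNOWN = None  # a stack slot whose contents cannot be predicted


def resolve_ret(instructions):
    """Return the RET target if it can be worked out statically, else None."""
    stack = []  # the top of the stack is the end of the list
    for op, arg in instructions:
        if op == "push":
            stack.append(arg)                     # known immediate value
        elif op == "sub_sp":
            # Reserving space: every 2 bytes is one slot of unknown content.
            stack.extend([UNKNOWN] * (arg // 2))
        elif op == "add_sp":
            # Releasing space: drop tracked slots (the caller's frame is ignored).
            for _ in range(arg // 2):
                if stack:
                    stack.pop()
        elif op == "ret":
            return stack.pop() if stack else UNKNOWN
    return UNKNOWN


# The sequence from above, with 0x1234 standing in for (offset).
code = [("sub_sp", 2), ("push", 0x1234), ("ret", None)]
target = resolve_ret(code)
```

The sub sp,2 only adds an unknown slot underneath; the RET still pops
the pushed offset, so the walker does not have to give up.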

Ever written a minesweeper game? The traversing you do is similar to
what my program does. It just walks through things until it would cross
its tracks twice and then it backs up to the last "branch" (point of
inflection).
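
The traversal described here can be sketched roughly as a depth-first
walk with a visited set. This is only an illustration in Python: the
flow table (address -> possible next addresses) and the name walk are
hypothetical, and a real walker would have to decode instructions to
build that table.

```python
def walk(successors, entry):
    """Depth-first walk over code addresses; returns every address reached."""
    visited = set()
    pending = [entry]              # branch points still waiting to be explored
    while pending:
        addr = pending.pop()       # back up to the most recent branch
        while addr is not None and addr not in visited:
            visited.add(addr)
            nxt = successors.get(addr, [])
            pending.extend(nxt[1:])           # remember the other branch targets
            addr = nxt[0] if nxt else None    # keep walking the first path
    return visited


# Hypothetical control flow: 1 branches to 2 or 5, and 3 loops back to 1.
flow = {0: [1], 1: [2, 5], 2: [3], 3: [1], 5: [6], 6: [], 9: [0]}
reached = walk(flow, 0)
```

The walk stops as soon as it would cross its own tracks (address 1 the
second time) and resumes from the saved branch at 5.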

Maybe you guys would like to take a crack at it? I can upload the source
if anyone's interested.

Here's what I want to do: make the plex86 BIOS and its I/O devices 100%
compatible with an existing computer. Even the undocumented things.
Right now the BIOS is _not_ 100% compatible. But it's a great start. Its
code is substantially different from any other BIOSes I've disassembled,
which means that it's legal (which is a good thing).

How can we reverse engineer chipsets etc. legally, so that they can be
used with plex86? Plex86's license agreement is LGPL which means that we
can't disassemble stuff for 'inspiration'. But can we write programs
that watch what I/O accesses happen during the normal operation of a
program and have that program draw conclusions? (Have a program that
takes notes like: port 123 returns 55 unless bit 3 of port 234 is set, in
which case it returns AA; only really it would all be done in tables
that would first be made along with a number specifying the 'sequence',
then generalized and simplified (discarding seemingly irrelevant things),
and then printed out to a text file.)
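
As a rough sketch of the note-taking step, the following Python searches
a recorded trace for a single bit of a single other port that fully
predicts what a read returns. The trace layout (state, port, value), the
name explain_port, and reading the port numbers as hex are all
assumptions made for the example.

```python
def explain_port(trace, port):
    """trace: list of (state, port, value); state maps ports to last value written.
    Returns a tuple describing what reads of `port` appear to depend on."""
    reads = [(state, value) for state, p, value in trace if p == port]
    values = {v for _, v in reads}
    if len(values) <= 1:
        return ("constant", values.pop() if values else None)
    # Try every bit of every other port seen in the recorded states.
    other_ports = sorted({q for state, _ in reads for q in state})
    for q in other_ports:
        for bit in range(8):
            seen = {}              # bit value -> value read from `port`
            consistent = True
            for state, v in reads:
                key = (state.get(q, 0) >> bit) & 1
                if seen.setdefault(key, v) != v:
                    consistent = False
                    break
            if consistent and len(seen) == 2:
                return ("depends", q, bit, seen[0], seen[1])
    return ("unexplained",)


# Made-up observations matching the example in the text.
trace = [
    ({0x234: 0x00}, 0x123, 0x55),
    ({0x234: 0x08}, 0x123, 0xAA),
    ({0x234: 0x01}, 0x123, 0x55),
    ({0x234: 0x09}, 0x123, 0xAA),
]
rule = explain_port(trace, 0x123)
```

Here rule says that port 123h depends on bit 3 of port 234h, returning
55h when the bit is clear and AAh when it is set. Anything the search
cannot pin down comes back as "unexplained" and needs a human.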

Anyone have any idea if that would be legal? It'd be so cool if we could
have a 100% compatible virtualized PC. I'm leaving it up to you guys to
make FPU and the CPU 100% compatible, but what about the BIOS and I/O
devices?

BTW a 100% compatible PC could accelerate growth of the WINE projects
because the techniques developed could also be used to systematically
create tables needed to make device drivers. Just run Windows in the VM
with a certain device driver, connect Plex86 over a parallel or serial
port to a "guest" or "slave" PC, and then use the VM under normal
operation. Then, after a while, shut down Windows. Then simplify the
data tables and output to a text file and.... if we can do that... then
we'd be able to emulate that SVGA or whatever chipset.

Thus Linux could "steal" Windows drivers by creating a
functionality-only text file. We'd then have to weed through everything
to take out "protection" things. Some device drivers send signatures to
unused bits or write to dummy I/O ports. Like maybe they'll write
12345678h to a port that'll be ignored. We'd have to take those out.

Maybe Corel provides Clean-Room services for free projects? heheheh.
just kidding.

Anyways, why stop at device drivers? Map out the Windows modules so we
have entrypoints to subroutines along with all the outputs that will be
used (we'll know because we've already called the things that call the
subroutine and seen what outs it does) along with the inputs.

Call the mapped subroutine with random inputs and look at the outputs. A
neural network might not be the correct solution for this, but somehow
correlate inputs and outputs.
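
One simple way to start correlating, sketched in Python under the
assumption that the subroutine can be wrapped as a pure function of an
integer input: flip one input bit at a time and record which output bits
ever change. The names influence and mystery are invented for the
example, and mystery is only a stand-in for a real mapped subroutine.

```python
import random


def influence(proc, n_bits, trials=200, seed=0):
    """For each input bit, OR together the output bits that changed
    when only that input bit was flipped."""
    rng = random.Random(seed)
    affected = [0] * n_bits
    for _ in range(trials):
        x = rng.getrandbits(n_bits)
        base = proc(x)
        for i in range(n_bits):
            affected[i] |= base ^ proc(x ^ (1 << i))
    return affected


def mystery(x):
    # Stand-in black box: bits 0-1 pass through, bit 3 lands in output
    # bit 4, and bit 2 is ignored entirely.
    return (x & 0b0011) | (((x >> 3) & 1) << 4)


masks = influence(mystery, 4)
```

An input bit whose mask stays 0 (bit 2 here) never showed any effect on
the output in these trials, which hints at, but does not prove, that it
can be dropped from the tables.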

This would all be done inside of plex86.

Thus if a procedure sent 20h to port 20h, we'd call the procedure and
notice that as the end result. This would be done on the
_highest_level_possible_. Procedures that do not call other procedures
and do not fall thru to other procedures I call Axiom procedures. They
are the simplest, smallest, and most specific procedures (except for
private ones that should have been inlined but weren't because someone
didn't want to). Anyways, these are the ones that get called the most,
since they are the most general.

Whenever a procedure jumps to another or falls thru to it, pretend it
calls that procedure and then returns.

The larger procedures are the ones we're after. We want to reverse
engineer as little code as possible.

We'll always know the input and output of every procedure we analyse
because we started at the HIGHEST level, reverse engineered that, and
noticed points of inflection (including calls to other procedures that
in turn did something like I/O or whatever that could be trapped,
reasonably, and thus must be duplicated; correlated with memory
changes).

IMHO, what I am describing is a reverse engineering technique that was
used by Award and AMI. It's evident from comparing their code
(disassemblies). That is to say that what I am describing is legal. It's
just that not everyone is interested in what you discover because they'd
then need to learn _exactly_ how _you_ know it, or else they'd have to
try to pretend not to have heard from you.

I can't trace thru an Award BIOS during bootup because it does a
hardware reset every time you do a software reset. Thus I've switched
over to reverse engineering an AMI 486 BIOS.

If you guys are interested, I can give you the results of what I do
after I get some useful information. IMHO you could use it to improve
your BIOS and make it 100% compatible.

As an example, I _could_ do this:

Set SS:SP to xxxx:0000 in real mode. The CPU will treat this like
SP=10000h. Thus if I generate an interrupt, SP will fall down 6 bytes
(FLAGS, CS and IP get pushed) to xxxx:FFFA.

Now I'd write 55's to the xxxx:0000..ffff.

Then I'd generate an interrupt, say INT 12h on my AMI BIOS.

Then when it returns I'd compare xxxx:0000..ffff against the 55's.

Then I'd repeat, using AA's instead of 55's. And again for 00's and
FF's.

Then I'd see which parts of the stack were NEVER written to. Then I'd
know how much stack space it would be safe for us to use.
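
The painting steps above can be simulated in a few lines of Python. This
is a sketch only: the 64 KiB segment is a bytearray, the "interrupt" is
an ordinary function call, and fake_int12 is a made-up handler, not
anything measured on a real AMI BIOS.

```python
SEG_SIZE = 0x10000  # one 64 KiB real-mode segment


def untouched_bytes(handler, patterns=(0x55, 0xAA, 0x00, 0xFF)):
    """Return the offsets the handler never changed under any fill pattern."""
    never_written = set(range(SEG_SIZE))
    for fill in patterns:
        stack = bytearray([fill]) * SEG_SIZE    # paint the whole segment
        handler(stack)                           # stand-in for the INT
        changed = {i for i, b in enumerate(stack) if b != fill}
        never_written -= changed
    return never_written


def fake_int12(stack):
    # Hypothetical handler: 6 bytes for the interrupt frame plus two
    # pushed words, 10 bytes in all, ending at the top of the segment.
    for off in range(0xFFF6, 0x10000):
        stack[off] = 0x55 if off % 2 else 0x12


free = untouched_bytes(fake_int12)
deepest = min(set(range(SEG_SIZE)) - free)   # lowest offset ever written
used = SEG_SIZE - deepest                     # stack bytes consumed
```

Repeating with several patterns matters because a byte that happens to
be written with the fill value looks untouched in that pass; the odd
offsets above are missed with 55's but caught with AA's, 00's and FF's.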

Plex86's BIOS uses a lot more stack space than AMI, AWARD, etc.

All the INT 8 handlers on the various BIOSes FIRST clear ints and then
send EOI to the PIC and then return. Know why? Because before doing that
they generated INT 1C. So what if INT 1C took too long and another IRQ
is pending? Believe it or not, all BIOSes do an STI in their INT 8
handler. I could know this by fiddling with the IRQ priorities and
making the CMOS alarm go off, and then doing INT 8 and hooking the CMOS
alarm IRQ handler. I'd then print out the fact that the CMOS went off,
before the floppy motor was updated or even the timer ticks past
midnight were updated.

But really I know that because I disassembled a lot of BIOSes and saw
that they all do STI before doing INT 1C.

Anyways, what if INT 1C took too long so that another IRQ 0 is pending?

In that case, if the INT 8 handler were to send an EOI with interrupts
enabled, _immediately_ after that another INT 8 would go off. Thus we'd
have two INT 8 handlers nested, taking up extra stack space. It could
even be an endless loop and cause a stack overflow, which does not
generate an exception in regular real mode (believe it or not).

That's why they all do a CLI before doing an EOI.
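
A toy Python model of the difference, assuming a very simplified PIC: a
tick that arrives while INT 1C is running stays pending until the EOI,
and fires right away after the EOI only if interrupts are enabled. The
name max_nesting and the slow_ticks parameter are invented for the
sketch; no real hardware timing is modelled.

```python
def max_nesting(cli_before_eoi, slow_ticks=50):
    """Simulate IRQ 0 arriving again while INT 1C is still running.
    Returns the deepest nesting of INT 8 handlers that was reached."""
    state = {"pending": False, "slow_left": slow_ticks, "depth": 0, "max": 0}

    def int8():
        state["depth"] += 1
        state["max"] = max(state["max"], state["depth"])
        # STI, then INT 1C. While the hook is slow another tick comes in;
        # it stays pending because IRQ 0 is still in service.
        if state["slow_left"] > 0:
            state["slow_left"] -= 1
            state["pending"] = True
        interrupts_enabled = not cli_before_eoi   # CLI (or not) before EOI
        # EOI: IRQ 0 leaves service, so a pending tick may fire right here.
        if interrupts_enabled and state["pending"]:
            state["pending"] = False
            int8()                                 # nested in this handler
        state["depth"] -= 1                        # IRET

    int8()
    # IRET restored the interrupt flag: deliver what is left, one at a time.
    while state["pending"]:
        state["pending"] = False
        int8()
    return state["max"]


with_cli = max_nesting(cli_before_eoi=True)
without_cli = max_nesting(cli_before_eoi=False)
```

With the CLI the handlers run back to back at depth 1; without it every
slow INT 1C adds another nested handler, each costing at least the 6
bytes of the interrupt frame on the stack.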

So those are things that one _could_ learn by calling the procedures
after mapping them out and trying various I/O conditions.

Now what if we write a program that tells us what tests to do? Thus we'd
write a program that would map out the AMI BIOS and consider possible
conditions, crossing its own tracks as many times as it has to, until a
linear sequence is met (thus what if there's a loop? It'd go thru the
loop until it figured out, "hey it's gunna do this 65536 times so I'm
gunna stop tracing thru it since there's nothing new"; also if there's
an endless loop, however complex, unless it relies on input from a
reserved I/O port, then, it would stop tracing its tracks. If it did
rely on input from a reserved I/O port bit then that fact would be noted
and it'd backtrack).

Then that program, when done, would provide a human with a list of
inputs and outputs that it should try.

Thus it would say, in effect, "change the IRQ priorities around to all
possibilities, letting the CMOS IRQ be pending, and then generate an int
8". We'd have all INTs hooked except the one being analysed.

Then the program would say, "now call f000:fea5 (the INT 8 handler)...".
We'd do what it says and notice that the CMOS interrupt goes off before
the timer gets updated.

Is this a more cost-effective way of reverse engineering? What if you
had a computer program that was "on the outside of the clean-room" and
_you_ (couldn't be me, hehe...) were the one "inside the cleanroom"? I don't
see any reason why, legally, this would be any different. Thus one
could, say, clone Windows, without violating its license agreement. One
would do this like so:

MS allows NetMeeting to be used with Windows. So have someone in an
independent company use NetMeeting to give you access to Windows. You
would not have signed the Windows license agreement! Thus you can use
the above method (right???) to reverse engineer it, looking at the
Inputs and Outputs. You'd be on the outside of the cleanroom and a
program, on the inside. Then you'd develop something similar based on
functionality of the high level functions. Internally, what you've done
is provided a skeleton 'converter' between various procedures and a
"host" OS. That OS could be Linux.

Thus what you've done would be for 'interoperability only'. If desired,
you could have another person make a native OS that could be connected
to the skeleton converter.

And behold! A windows clone.

So am I just rambling or is anyone else here interested in this shit?

-WS

Drew Northup wrote:
> 
> Sounds cool...., but a SCSI device would only complicate things--despite
> the _massive_ bandwidth increase available.  It would be good as a later
> addition to have the SCSI available as a communications means--and possibly
> good old ethernet also.  This sounds a lot like the plex86-RFB & bochs-RFB
> projects that Donald Becker (aka "X-Odus") started on a while ago.  I'd
> like to hear how this works out.
> 
> Michael Wood wrote:
> 
> > On Wed, May 23, 2001 at 02:02:47PM +0400, Sintsov Dmitri wrote:
> > >
> > >
> > > On Tue, 22 May 2001, Willow Schlanger wrote:
> > >
> > > > If anyone's insterested, here's something I'm working on:
> > > > I'm writing a miniture OS that communicates with a host
> > > > computer via the parallel-port. The host computer has the
> > [snip]
> > > Cool idea for debugging but the SVGA's MMIO throuth the LPT
> > > cable wouldn't run fast (needs massive bandwitch).
> >
> > hmmm... maybe a SCSI cable would be better? :)
> >
> > --
> > Michael Wood        | Tel: +27 21 762 0276 | http://www.kingsley.co.za/
> > [EMAIL PROTECTED] | Fax: +27 21 761 9930 | Kingsley Technologies