[plex86] OT: neural networks and dissassembly

Fraxinus Wed, 23 May 2001 14:15:41 -0700
I don't understand much of this, but I get the feeling that this is very
revolutionary and cool... :)


--- Hugo Ahlenius
       [EMAIL PROTECTED]

----- Original Message -----
From: "Willow Schlanger" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, May 23, 2001 19:25
Subject: Re: [plex86] [no subject]


| Note: this is off-topic.
|
| Know what would _really_ be cool? Running a neural network that analyses
| the port I/O to the SVGA/chipset. It could devise a series of
| instructions (you know, logic gates connected to each other) that would
| mimic the hardware enough to boot up anyways. Of course it would always
| encounter flaws (what if you do something that the neural network wasn't
| trained for?) but IMHO it would be good enough, since you'd always
| _know_ when you encountered something new, if you worked at it.
|
| This would be _no_ good whatsoever for creating a program to emulate an
| actual chipset (like an SVGA chipset) but it would be good enough to
| reduce bandwidth.
|
| Too bad I don't have an oscilloscope. I could feed a bunch of wires into
| an oscilloscope and use that as input to the neural network. Then it
| could actually devise software that duplicates the functionality of the
| chips on the board. The NN would have ins and outs that go through
| "gates" that emulate all the logic that is obvious. It would figure
| out the rest by guess & check, comparing its output predictions to
| tables of I/Os already recorded and to the actual outputs, past and
| future. Once the prediction rate is 95%, you'd do this:
|
| While the NN's connections were being made you'd keep track of things
| and then when it's done and scores well enough you'd write a program
| that fed in all possible inputs to get all possible outputs, _except_
| where the ins/outs appear to be linear. In those cases you'd verify
| that they are in fact linear by analysing (somehow????) the neural
| network's connections and then write that to a text file.
|
| The result: you'd have a text file giving you all the documentation you
| need to write a program that emulates all the hardware on an SVGA card.
|
| Nifty idea, eh? Too bad I don't have an oscilloscope....
|
| BTW I'm trying to make my program that walks through code smarter. It's
| really pissing me off. If someone does this:
|
| sub sp,2
| PUSH (offset)
| ret
|
| the stupid thing doesn't know that the RET will cause eIP to go to
| (offset) because the sub sp,2 confuses it.
|
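One way a code walker could get past this case is to model the stack as a list of word-sized slots, some of which are unknown. The sketch below is only an illustration in Python, not the walker described in the message: the instruction tuples and the name `resolve_ret_target` are hypothetical, and it assumes 16-bit pushes. Under those assumptions `sub sp,2` just adds one unknown slot, so the pushed offset is still on top when RET executes.

```python
UNKNOWN = None  # marker for a stack slot whose contents the walker cannot know


def resolve_ret_target(instructions):
    """Return the address a RET would jump to, or UNKNOWN if it cannot be determined.

    `instructions` is a list of (opcode, operand) tuples; this tiny
    instruction set is an assumption made for the example.
    """
    stack = []  # top of stack is the end of the list; each slot is one 16-bit word
    for op, arg in instructions:
        if op == "push":
            stack.append(arg)                     # known value lands on top
        elif op == "sub_sp":
            if arg % 2:
                return UNKNOWN                    # odd adjustment: give up
            stack.extend([UNKNOWN] * (arg // 2))  # reserve unknown slots
        elif op == "add_sp":
            words = arg // 2
            if arg % 2 or words > len(stack):
                return UNKNOWN                    # would pop below what we tracked
            del stack[len(stack) - words:]        # discard the top slots
        elif op == "ret":
            return stack.pop() if stack else UNKNOWN
    return UNKNOWN


# sub sp,2 / push 1234h / ret
code = [("sub_sp", 2), ("push", 0x1234), ("ret", None)]
target = resolve_ret_target(code)
```

With this model the walker only hits a dead end when the slot on top of the stack at the RET is itself unknown, for example `sub sp,2` followed directly by `ret`.
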
| Then I'm left with a dead end.
|
| Ever written a minesweeper game? The traversing you do is similar to
| what my program does. It just walks through things until it would cross
| its tracks twice and then it backs up to the last "branch" (point of
| inflection).
|
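The traversal described above reads like a depth-first walk with a visited set. A minimal sketch in Python, assuming the control flow has already been reduced to a map from each address to its possible successors (the `flow` table and the name `walk` are made up for the example):

```python
def walk(successors, entry):
    """Depth-first walk of a control-flow map; returns addresses in first-visit order."""
    order = []         # addresses in the order they were first reached
    seen = set()       # the "tracks" already crossed
    pending = [entry]  # branches still to be explored; the last one is tried first
    while pending:
        addr = pending.pop()
        if addr in seen:
            continue   # would cross our own tracks: back up to the last branch
        seen.add(addr)
        order.append(addr)
        # Push in reverse so the first listed successor is explored first.
        for nxt in reversed(successors.get(addr, [])):
            pending.append(nxt)
    return order


# Hypothetical flow: 102h is a conditional branch, 104h loops back to 100h.
flow = {0x100: [0x102], 0x102: [0x104, 0x110], 0x104: [0x100], 0x110: []}
order = walk(flow, 0x100)
```

Here the jump from 104h back to 100h is skipped because 100h was already visited, and the walk resumes at the other side of the branch at 102h.
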
| Maybe you guys would like to take a crack at it? I can upload the source
| if anyone's interested.
|
| Here's what I want to do: make the plex86 BIOS and its I/O devices 100%
| compatible with an existing computer. Even the undocumented things.
| Right now the BIOS is _not_ 100% compatible. But it's a great start. Its
| code is substantially different from any other BIOSes I've disassembled,
| which means that it's legal (which is a good thing).
|
| How can we reverse engineer chipsets etc. legally, so that they can be
| used with plex86? Plex86's license agreement is LGPL which means that we
| can't disassemble stuff for 'inspiration'. But can we write programs
| that watch what I/O accesses happen during the normal operation of a
| program and have that program draw conclusions? (Have a program that
| takes notes like: port 123 returns 55 unless bit 3 of port 234 is set in
| which case it returns AA; only really it would all be done in tables
| that would first be made along with a number specifying the 'sequence',
| then generalized and simplified (discard seemingly irrelevant things),
| and then printed out to a text file).
|
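As a rough illustration of the note-taking program described above, the Python sketch below looks for a single control bit that by itself determines what a read returns. Everything in it is an assumption made for the example: the sample data, the 8-bit width, the name `find_controlling_bits`, and reading "55" and "AA" as hex values.

```python
from collections import defaultdict


def find_controlling_bits(samples, width=8):
    """Find control bits that alone determine the value read back.

    `samples` is a list of (control_value, read_value) pairs, e.g. the last
    value written to port 234 and the value then read from port 123.
    Returns {bit: {0: value_when_clear, 1: value_when_set}}.
    """
    rules = {}
    for bit in range(width):
        seen = defaultdict(set)  # bit state -> set of values read in that state
        for control, value in samples:
            seen[(control >> bit) & 1].add(value)
        consistent = len(seen) == 2 and all(len(v) == 1 for v in seen.values())
        if consistent and seen[0] != seen[1]:
            rules[bit] = {state: next(iter(vals)) for state, vals in seen.items()}
    return rules


# Made-up trace: the read returns 55h unless bit 3 of the control port is set.
samples = [(0x00, 0x55), (0x08, 0xAA), (0x01, 0x55),
           (0x09, 0xAA), (0xF7, 0x55), (0xFF, 0xAA)]
rules = find_controlling_bits(samples)
```

A real trace would also need the 'sequence' number mentioned above, since many ports depend on the order of earlier accesses and not just on the current state of one bit.
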
| Anyone have any idea if that would be legal? It'd be so cool if we could
| have a 100% compatible virtualized PC. I'm leaving it up to you guys to
| make FPU and the CPU 100% compatible, but what about the BIOS and I/O
| devices?
|
| BTW a 100% compatible PC could accelerate growth of the WINE projects
| because the techniques developed could also be used to systematically
| create tables needed to make device drivers. Just run Windows in the VM
| with a certain device driver, connect Plex86 over a parallel or serial
| port to a "guest" or "slave" PC, and then use the VM under normal
| operation. Then, after a while, shut down Windows. Then simplify the
| data tables and output to a text file and.... if we can do that... then
| we'd be able to emulate that SVGA or whatever chipset.
|
| Thus Linux could "steal" Windows drivers by creating a
| functionality-only text file. We'd then have to weed through everything
| to take out "protection" things. Some device drivers send signatures to
| unused bits or write to dummy I/O ports. Like maybe they'll write
| 12345678h to a port that'll be ignored. We'd have to take those out.
|
| Maybe Corel provides Clean-Room services for free projects? heheheh.
| just kidding.
|
| Anyways, why stop at device drivers? Map out the Windows modules so we
| have entrypoints to subroutines along with all the outputs that will be
| used (we'll know because we've already called the things that call the
| subroutine and seen what outs it does) along with the inputs.
|
| Call the mapped subroutine with random inputs and look at the outputs. A
| neural network might not be the correct solution for this, but somehow
| correlate inputs and outputs.
|
| This would all be done inside of plex86.
|
| Thus if a procedure sent 20h to port 20h, we'd call that procedure and
| notice the end result. This would be done on the
| _highest_level_possible_. Procedures that do not call other procedures
| and do not fall thru to other procedures I call Axiom procedures. They
| are the simplest, smallest, and most specific procedures and are, except
| for private ones that should be inlined but weren't because someone
| didn't want to, ... anyways these are called the most since they are the
| most general.
|
| Whenever a procedure jumps to another or falls thru to it, pretend it
| calls that procedure and then returns.
|
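Once the procedures are mapped, picking out the "Axiom" ones is a small graph query. The Python sketch below assumes two hypothetical tables, `calls` and `jumps`, each mapping a procedure name to the procedures it calls or jumps/falls through to. Following the rule above, a jump or fall-through counts as a call, so a procedure with either kind of outgoing edge is not an Axiom procedure.

```python
def axiom_procedures(calls, jumps):
    """Return the sorted names of procedures with no outgoing call or jump edges."""
    procs = set(calls) | set(jumps)
    for targets in list(calls.values()) + list(jumps.values()):
        procs.update(targets)  # include procedures that only appear as targets
    # A jump or fall-through is treated exactly like a call followed by a return.
    return sorted(p for p in procs if not calls.get(p) and not jumps.get(p))


# Made-up procedure names, for illustration only.
calls = {"main": ["init", "draw"], "draw": ["put_pixel"]}
jumps = {"init": ["put_pixel"]}
leaves = axiom_procedures(calls, jumps)
```

In this example only `put_pixel` qualifies: `init` makes no calls, but it jumps to `put_pixel`, which is counted as a call.
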
| The larger procedures are the ones we're after. We want to reverse
| engineer as little code as possible.
|
| We'll always know the input and output of every procedure we analyse
| because we started at the HIGHEST level and reverse engineered that and
| noticed points of inflection (including calls to other procedures that
| in turn did something like I/O or whatever that could be trapped,
| reasonably, and thus must be duplicated; correlated with memory
| changes).
|
| IMHO, what I am describing is a reverse engineering technique that was
| used by Award and AMI. It's evident from comparing their code
| (disassemblies). That is to say that what I am describing is legal. It's
| just that not everyone is interested in what you discover because they'd
| then need to learn _exactly_ how _you_ know it, or else they'd have to
| try to pretend not to have heard from you.
|
| I can't trace thru an Award BIOS during bootup because it does a
| hardware reset every time you do a software reset. Thus I've switched
| over to reverse engineering an AMI 486 BIOS.
|
| If you guys are interested, I can give you the results of what I do
| after I get some useful information. IMHO you could use it to improve
| your BIOS and make it 100% compatible.
|
| As an example, I _could_ do this:
|
| set SS:SP to xxxx:0000 in real mode. The CPU will treat this like
| SP=10000h. Thus if I generate an interrupt SP will fall down 6 bytes
| (FLAGS, CS and IP are pushed) to xxxx:FFFA.
|
| Now I'd write 55's to the xxxx:0000..ffff.
|
| Then I'd generate an interrupt, say INT 12h on my AMI BIOS.
|
| Then when it returns I'd compare xxxx:0000 to 55's.
|
| then I'd repeat, using AA's instead of 55's. And again for 00's and
| FF's.
|
| Then I'd see which parts of the stack were NEVER written to. Then I'd
| know how much stack space it would be safe for us to use.
|
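The fill-and-compare test above can be tried out in simulation first. The Python sketch below is only a model: `fake_handler` is a made-up stand-in for the real INT handler, and the segment is a 64 KiB `bytearray`. It also shows why more than one fill pattern is needed: a handler that happens to write 55h is invisible in the 55h pass but shows up in the AAh pass.

```python
STACK_SIZE = 0x10000  # one full real-mode segment, offsets 0000h..FFFFh


def untouched_offsets(handler, patterns=(0x55, 0xAA, 0x00, 0xFF)):
    """Return the set of offsets the handler never wrote, across all fill patterns."""
    untouched = set(range(STACK_SIZE))
    for fill in patterns:
        stack = bytearray([fill]) * STACK_SIZE  # paint the whole segment
        handler(stack)                          # "generate the interrupt"
        # Keep only the offsets that still hold the fill value in this pass.
        untouched &= {i for i, b in enumerate(stack) if b == fill}
    return untouched


def fake_handler(stack):
    """Hypothetical handler: the INT pushes 3 words, the handler pushes 2 more."""
    sp = 0x0000  # SP=0000 behaves like 10000h: the first push wraps to FFFEh
    for _ in range(5):
        sp = (sp - 2) & 0xFFFF
        stack[sp] = 0x55  # deliberately writes a value equal to one fill pattern
        stack[sp + 1] = 0x55


free = untouched_offsets(fake_handler)
lowest_used = max(free) + 1
```

For this fake handler the lowest offset ever written is FFF6h, i.e. 10 bytes of stack. On real hardware the same comparison would be done by the test program itself after the interrupt returns.
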
| Plex86's BIOS uses a lot more stack space than AMI, AWARD, etc.
|
| All the INT 8 handlers on the various BIOSes FIRST clear ints and then
| send EOI to the pic and then return. Know why? Because before doing that
| they generated INT 1C. So what if INT 1C took too long and another IRQ
| is pending? Believe it or not, all BIOSes do an STI in their INT 8
| handler. I could know this by fiddling with the IRQ priorities and
| making the CMOS alarm go off, and then doing INT 8 and hooking the CMOS
| alarm IRQ handler. I'd then print out the fact that the CMOS went off,
| before the floppy motor was updated or even the timer ticks past
| midnight were updated.
|
| But really I know that because I disassembled a lot of BIOSes and saw
| that they all do STI before doing INT 1C.
|
| Anyways, what if INT 1C took too long so that another IRQ 0 is pending?
|
| In that case, if the INT 8 handler were to send an EOI, _immediately_
| after that another INT 8 would go off. Thus we'd have two int 8
| procedures, taking up extra stack space. It could even be an endless
| loop and cause a stack overflow which does not generate an exception in
| regular real mode (believe it or not).
|
| That's why they all do a CLI before doing an EOI.
|
| So those are things that one _could_ learn by calling the procedures
| after mapping them out and trying various I/O conditions.
|
| Now what if we write a program that tells us what tests to do? Thus we'd
| write a program that would map out the AMI BIOS and consider possible
| conditions, crossing its own tracks as many times as it has to, until a
| linear sequence is met (thus what if there's a loop? It'd go thru the
| loop until it figured out, "hey it's gunna do this 65536 times so I'm
| gunna stop tracing thru it since there's nothing new"; also if there's
| an endless loop, however complex, unless it relies on input from a
| reserved I/O port, then, it would stop tracing its tracks. If it did
| rely on input from a reserved I/O port bit then that fact would be noted
| and it'd backtrack).
|
| Then that program, when done, would provide a human with a list of
| inputs and outputs that it should try.
|
| Thus it would say, in effect, "change the IRQ priorities around to all
| possibilities, letting the CMOS IRQ be pending, and then generate an int
| 8". We'd have all INTs hooked except the one being analysed.
|
| Then the program would say, "now call f000:fea5 (the INT 8 handler)...".
| We'd do what it says and notice that the CMOS interrupt goes off before
| the timer gets updated.
|
| Is this a more cost-effective way of reverse engineering? What if you
| had a computer program that was "on the outside of the clean-room" and
| _you_ (couldn't be me, hehe...) were the "inside the cleanroom"? I don't
| see any reason why, legally, this would be any different. Thus one
| could, say, clone Windows, without violating its license agreement. One
| would do this like so:
|
| MS allows NetMeeting to be used with Windows. So have someone in an
| independent company use NetMeeting to give you access to Windows. You
| would not have signed the Windows license agreement! Thus you can use
| the above method (right???) to reverse engineer it, looking at the
| Inputs and Outputs. You'd be on the outside of the cleanroom and a
| program, on the inside. Then you'd develop something similar based on
| functionality of the high level functions. Internally, what you've done
| is provide a skeleton 'converter' between various procedures and a
| "host" OS. That OS could be Linux.
|
| Thus what you've done would be for 'interoperability only'. If desired,
| you could have another person make a native OS that could be connected
| to the skeleton converter.
|
| And behold! A Windows clone.
|
| So am I just rambling or is anyone else here interested in this shit?
|
| -WS
|
| Drew Northup wrote:
| >
| > Sounds cool...., but a SCSI device would only complicate things--despite
| > the _massive_ bandwidth increase available.  It would be good as a later
| > addition to have the SCSI available as a communications means--and
| > possibly good old ethernet also.  This sounds a lot like the plex86-RFB &
| > bochs-RFB projects that Donald Becker (aka "X-Odus") started on a while
| > ago.  I'd like to hear how this works out.
| >
| > Michael Wood wrote:
| >
| > > On Wed, May 23, 2001 at 02:02:47PM +0400, Sintsov Dmitri wrote:
| > > >
| > > >
| > > > On Tue, 22 May 2001, Willow Schlanger wrote:
| > > >
| > > > > If anyone's interested, here's something I'm working on:
| > > > > I'm writing a miniature OS that communicates with a host
| > > > > computer via the parallel-port. The host computer has the
| > > [snip]
| > > > Cool idea for debugging but the SVGA's MMIO through the LPT
| > > > cable wouldn't run fast (needs massive bandwidth).
| > >
| > > hmmm... maybe a SCSI cable would be better? :)
| > >
| > > --
| > > Michael Wood        | Tel: +27 21 762 0276 | http://www.kingsley.co.za/
| > > [EMAIL PROTECTED] | Fax: +27 21 761 9930 | Kingsley Technologies
|
|