Andrew Collier wrote:
> 16* uncontended t-states for an IN a,(n) or OUT (n),a;
> 20** uts for an IN a,(c) or OUT (c),a.
>
> Question: Does SimCoupe currently use those values for the
> instruction time?

Not quite so fixed, as the position in the scanline can vary the timings by 4
t-states.  I currently still have the raw timings for the instructions in
the code (i.e. 11 for IN A,(n), 12 for OUT (C),r and 16 for OUTI) but then
tweak the values as necessary according to the display position and state.
At some point I'll wrap the 'tstate' variable assignment macro so the values
are already rounded to 4 t-states at compile time, to save doing it at
run-time.
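
As a rough illustration of rounding the raw timings once, up front, rather
than per executed instruction (Python is used purely as sketch notation; the
table and function names are hypothetical, not taken from the SimCoupe source,
and rounding is assumed to go upwards to the next multiple of 4):

```python
def round_up(tstates, boundary=4):
    """Round a t-state count up to the next multiple of `boundary`."""
    return (tstates + boundary - 1) // boundary * boundary

# Raw timings quoted in the message (hypothetical table name).
RAW_TSTATES = {"IN A,(n)": 11, "OUT (C),r": 12, "OUTI": 16}

# Rounded once when the table is built, so nothing is rounded per instruction.
ROUNDED_TSTATES = {name: round_up(t) for name, t in RAW_TSTATES.items()}
```

With these inputs only IN A,(n) changes (11 becomes 12); the other two are
already multiples of 4.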

The current implementation does 8 t-state instruction rounding for all parts
of the mode 1 display, or just for the centre 256x192 block of the screen in
the other modes when the screen is enabled.  The remaining situations use
the normal 4 t-state rounding.  I treat mode 2 the same as modes 3 and 4,
except the screen can't be disabled - am I right in thinking the timings are
the same?
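
The selection rule described above could be sketched like this (a minimal
sketch only; the function and parameter names are made up for illustration,
and mode 2 is handled as described, i.e. as if its screen were always
enabled):

```python
def rounding_boundary(mode, in_main_screen, screen_enabled):
    """Return the instruction rounding (in t-states) for the current position.

    mode           -- screen mode, 1 to 4
    in_main_screen -- True inside the centre 256x192 block
    screen_enabled -- True unless the screen has been switched off
    """
    if mode == 1:
        return 8          # all parts of the mode 1 display
    if mode == 2:
        screen_enabled = True   # assumption from the text: can't be disabled
    if in_main_screen and screen_enabled:
        return 8          # centre block in modes 2, 3 and 4
    return 4              # border areas, or screen disabled
```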

For all the instructions that do port reads and writes there's an additional
4 t-states that are added for situations when the current scan position is
not already 8 t-state aligned, and only for ports >= 0xe0 (from Si Cooke).
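
In sketch form, the port rule reads as follows (hypothetical names; `port` is
assumed here to be the 8-bit port number, and `line_cycle` whatever scan
position the alignment is judged at):

```python
def port_access_penalty(line_cycle, port):
    """Extra t-states for a port read or write at the given scan position."""
    if port >= 0xE0 and line_cycle % 8 != 0:
        return 4   # not 8 t-state aligned: wait for the next boundary
    return 0       # already aligned, or a port below 0xe0
```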

Additional timings values I'm using that make a big difference for some
software are: IM0/IM1 time = 8 t-states, IM2 = 16 t-states (rounded),
interrupts active for 120 t-states and visible on the status port for 102
t-states (thanks for all those Ian!) and line interrupts are triggered at
the start of the right-hand border area (line cycle position = 384-64 =
320).  I've removed all of the original timing tweaks that were in the code
for whatever reason as they only seemed to break things - I was hoping to do
without any but we'll see how it goes.  All these have helped it survive
various small timing tests done by various people, including the now
infamous Defender loop ;-)

Now I've implemented the display changes (border, palette and/or video mode)
to instruction level it's possible to see how it copes with some of the SAM
demos that rely on perfect timing.  In general it seems to cope quite well,
but there still seem to be some subtle timing issues that mean the
left-right positioning isn't quite right (not looked into yet).  Effects in
the border seem to run at the right speed, but ones on the main screen area
run a little too fast.  Here are some screenshots and problems (20-30K
each):

Mnemo demo 1, part 2 (http://www.obobo.demon.co.uk/mnemo1p2.jpg).  The
bottom border display stays synchronised correctly but its left-right
position isn't correct.  The scroller on the main screen uses rapid VMPR
switching, but the diagonal pattern shows it's running too fast so it's not
lined up correctly, so some additional delay is needed.

Mnemo demo 2, part 2 (http://www.obobo.demon.co.uk/mnemo2p2.jpg).  Scroller
lined up ok, but the right hand edge has a strange stretching effect for the
edge of the character (an 'o' in this case) that's appearing.

Mnemo demo 2, part 3 (http://www.obobo.demon.co.uk/mnemo2p3.jpg).  Bars in
bottom border jump left and right by 1 to 2 blocks worth, and the
misalignment gives additional lines that should probably be aligned with the
bars (?).  [it doesn't just run at 11fps as the title says, I'd just
unpaused it and it hadn't settled].

E-Tunes demo (http://www.obobo.demon.co.uk/e-tunes.jpg). VMPR switched
scroller now visible, tho the start position is slightly to the right and
it's also a couple of blocks too short.

Lyra 3 (http://www.obobo.demon.co.uk/lyra3.jpg).  Top scroller seems to run
fine, and bottom static image is shifted to the right.  The other visual
artifacts I used to see (where the screen should be disabled to hide stuff?)
are no longer visible :-)

The current SimCoupe doesn't seem to show enough of the border areas (mainly
top and bottom) to show all of the border effects, so it might be worth
having windowed modes show more of them (possibly optionally).  'ESI' in
Lyra 3 does look rather like 'FST' :-)


> Summary:
>         OUT (n),a   OUT (c),a   IN a,(n)    IN a,(c)
> VMPR     16 *        20 **       16 *        20 **

Those values fit the case when the scan position isn't 8 t-state aligned,
since the extra 4 t-states is being added to both (of course, there's no 8
t-state rounding to consider in your tests).  If you put a NOP before the
other instructions I'd expect you to get the same timing result, as the NOP
will add 4 t-states but there won't be an additional 4 t-states to add for
the alignment - would you please be able to try it?
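
The arithmetic behind that expectation, as a small sketch (names are
hypothetical; IN A,(n) is taken as 12 t-states after rounding, and the
alignment is assumed to be judged at the position reached once the
instruction time has been added to the line cycle position):

```python
NOP_TSTATES = 4
IN_TSTATES = 12  # IN A,(n): raw 11, rounded up to 12

def io_penalty(line_cycle, instr_tstates):
    # Assumption: alignment is judged after the instruction time is added.
    return 4 if (line_cycle + instr_tstates) % 8 != 0 else 0

def total_tstates(start, with_nop):
    """Total time for IN A,(n), optionally preceded by a NOP."""
    position = start
    if with_nop:
        position += NOP_TSTATES
    position += IN_TSTATES + io_penalty(position, IN_TSTATES)
    return position - start
```

Under these assumptions `total_tstates(0, False)` and `total_tstates(0, True)`
both come to 16: the NOP's 4 t-states simply take the place of the alignment
delay.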

I've noticed that some places where the video timing isn't quite right seem
to involve DJNZ for tight delay loops.  The width of the scroller section
used by the E-Tunes demo is mainly just one such loop.  Is there anything
special about DJNZ in terms of timing that could cause it to be too fast?

I'm also starting to wonder about instructions lying across the boundaries
where memory contention is introduced, as the subtle timings might make a
difference, and that'd be difficult to implement.  Another sub-instruction
thing I've wondered about is whether I need to worry about which part of the
instruction actually does the OUT that'll affect the video e.g. does it
occur before the end of an OUTI?  and if so, could it make a 1 block
difference in some cases?  (or am I just getting paranoid about timing?!
No, don't answer that...)

Comments and/or corrections on any of the above would be greatly
appreciated!

Best regards,

Si

ICQ: 9769343, Homepage: http://www.obobo.demon.co.uk/

P.S.  I've just realised that the OTd(R)/INd(R) instruction timings won't be
right since the extra 4 t-states is calculated before the instruction
time is added to the LineCycleCounter value. I assume the instruction takes
12 t-states, which is correct for the other types of I/O instructions but
not for the block ones (which are 16).  I'll have to correct that and see if
it fixes any of the above tests...
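
To show why the assumed 12 matters, here is one reading of the problem in
sketch form (hypothetical names; this assumes the alignment test is made on
the line cycle position plus the instruction time, which may not be exactly
how the real code is arranged):

```python
def extra_io_tstates(line_cycle, instr_tstates):
    """Alignment delay judged at line_cycle + instr_tstates (an assumption)."""
    return 4 if (line_cycle + instr_tstates) % 8 != 0 else 0

ASSUMED_IO_TSTATES = 12   # right for the ordinary I/O instructions
BLOCK_IO_TSTATES = 16     # block I/O instructions such as OUTI

def block_io_penalties(line_cycle):
    """Return (penalty with the assumed time, penalty with the real time)."""
    return (extra_io_tstates(line_cycle, ASSUMED_IO_TSTATES),
            extra_io_tstates(line_cycle, BLOCK_IO_TSTATES))
```

Since 12 and 16 differ by 4, the two answers always disagree for positions
that are multiples of 4, so under this reading the block instructions would
get the delay exactly when they shouldn't.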
