On Mon, 2 Mar 2026, Chad Jablonski wrote:
I think these may not be the best constants as you still do a lot of
calculation with them so maybe it could be simplified but I don't know
how. If there's no visible difference in output with normal usage, only in
corner cases that only happened during testing, then maybe it's too early
to try to implement this and we could go back to the previous 128 bit
accumulator unless you found some real world usage where this would
matter. Otherwise I think this would only matter with a concurrent 2D
engine where host data writes could continue in the other half of the 256
bit register while the 2D engine does the operation on the already written
half. So if we get the same graphical result with only storing 128 bits
for now we could do that and rethink this when we can run the 2D engine
concurrently.


With well-behaved drivers there should be no difference. The only real
behavioral difference I've seen is when host_data blits are ended early.
I have tests covering this behavior now and it's documented so I have no problem
going back to the 128-bit implementation for now. If we do run into real
drivers that depend on this we can always easily resurrect it.

We only have a week left before the freeze for the next release, so I
think we should finalize the series with the next version without adding
much more complexity at the last minute. I suggest we go back to 128 bit
as in the previous version, add the simple fixes from the last version,
and use that for now rather than starting a bigger rewrite. This should
work with Xorg and MorphOS, and we can do more accurate emulation on top
in the next devel cycle if it turns out to be needed. If errors are found
we can still fix them during the freeze, but only if the series is merged
by then. Will you make a version in the coming days, or should I help and
compile a series with the patches I think should work?

Regards,
BALATON Zoltan
