>
> I think these may not be the best constants as you still do a lot of 
> calculation with them so maybe it could be simplified but I don't know 
> how. If there's no visible difference in output with normal usage only in 
> corner cases that only happened during testing then maybe it's too early 
> to try to implement this and we could go back to the previous 128 bit 
> accumulator unless you found some real world usage where this would 
> matter. Otherwise I think this would only matter with a concurrent 2D 
> engine where host data writes could continue in the other half of the 256 
> bit register while the 2D engine does the operation on the already written 
> half. So if we get the same graphical result with only storing 128 bits 
> for now we could do that and rethink this when we can run the 2D engine 
> concurrently.
>

With well-behaved drivers there should be no difference. The only real
behavioral difference I've seen is when host_data blits are ended early.
I have tests covering this behavior now and it's documented, so I have no
problem going back to the 128-bit implementation for now. If we do run into
real drivers that depend on this, we can always easily resurrect it.
