> 1) new pixels = (old pixels & mask1) | (new pixels & mask2)
>
> Where mask1 and mask2 are the negated forms of each other.
> This even works for alpha masks:
>
> 2) new pixels = (old pixels "*" mask1) "+" (new pixels "*" mask2)

I understand what you're going for in principle, but I'm not sure what
magic you're trying to do. The computation for number 1 above doesn't
actually work. This just returns what new_pixels was except with the
alpha value of old_pixels.
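
For reference, here is roughly how formula 1 reads when written out in C
on 32-bit words. This is only a sketch under assumptions: pixels are 8
bits each, four packed per word, and every byte of the mask is either
0x00 (keep the old pixel) or 0xFF (take the new one). The name
masked_copy32 and its parameters are made up for illustration. mask2 is
the mask itself and mask1 is its bitwise negation.

```c
#include <stddef.h>
#include <stdint.h>

/* Formula 1 on n 32-bit words, i.e. four 8-bit pixels per step.
 * Each mask byte is assumed to be exactly 0x00 or 0xFF:
 *   0xFF -> take the new pixel from src
 *   0x00 -> keep the old pixel already in dst
 * Any other mask byte value would mix bits of both pixels. */
static void masked_copy32(uint32_t *dst, const uint32_t *src,
                          const uint32_t *mask, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = (dst[i] & ~mask[i]) | (src[i] & mask[i]);
}
```

If the mask bytes hold anything other than 0x00 or 0xFF (for example a
real alpha level), the result is a bitwise mix rather than a selection,
which is why the alpha case needs formula 2 instead.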

So far, the best approach I can think of is a multi-byte alpha check:
if any one of the bytes fails, go through them one byte at a time to
find the offending one so that it is not written, and then proceed.
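
A minimal sketch of that fallback scheme, with loud assumptions: pixels
are 8-bit, and a source byte of 0 is the "do not write" key (the thread
does not say how transparency is actually encoded). The names
TRANSPARENT, has_zero_byte and blit_keyed are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRANSPARENT 0u  /* assumed key: source byte 0 means "skip" */

/* Nonzero if at least one of the four bytes in w is zero.
 * This only works because TRANSPARENT is 0; for another key value,
 * XOR w with the key repeated in all four bytes first. */
static int has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* Copy n pixels from src to dst, skipping transparent ones. */
static void blit_keyed(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0, j;

    for (; i + 4 <= n; i += 4) {
        uint32_t w;

        memcpy(&w, src + i, 4);          /* alignment-safe 32-bit load */
        if (!has_zero_byte(w)) {
            memcpy(dst + i, &w, 4);      /* fast path: all four opaque */
        } else {
            for (j = i; j < i + 4; j++)  /* slow path: byte by byte */
                if (src[j] != TRANSPARENT)
                    dst[j] = src[j];
        }
    }
    for (; i < n; i++)                   /* leftover pixels */
        if (src[i] != TRANSPARENT)
            dst[i] = src[i];
}
```

The fast path pays off only when most words are fully opaque; a sprite
with many transparent pixels falls back to the byte loop most of the
time.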

I've been pulling my hair out all day today trying to come up with a way
to do this without byte-level checking.
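
One possible direction, sketched under assumptions rather than offered
as a tested answer: if transparency is signalled by a key value of 0 in
the source pixel (an assumption; the real encoding may differ), the
0x00/0xFF mask can be derived from the source word itself with plain
32-bit arithmetic, so no per-byte branch is needed. The function names
below are hypothetical.

```c
#include <stdint.h>

/* For each byte of w: 0xFF if the byte is nonzero, 0x00 if it is zero.
 * Assumes a source byte of 0 is the transparent key. */
static uint32_t nonzero_byte_mask(uint32_t w)
{
    /* Adding 0x7F to the low 7 bits sets bit 7 when they are nonzero;
     * the sum is at most 0xFE, so nothing carries into the next byte. */
    uint32_t t = (w & 0x7F7F7F7Fu) + 0x7F7F7F7Fu;

    t = (t | w) & 0x80808080u;   /* bit 7 set iff the byte is nonzero */
    return (t >> 7) * 0xFFu;     /* widen each 0x01 to 0xFF */
}

/* Four pixels at once: keep dst where src is transparent, else src. */
static uint32_t blend_keyed32(uint32_t dst, uint32_t src)
{
    uint32_t mask = nonzero_byte_mask(src);

    return (dst & ~mask) | (src & mask);
}
```

This costs one read of the destination per word, which can be slow if
the destination is video memory; blending into a buffer in normal RAM
first and copying it out afterwards avoids that.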


Happy Hacking,

David E. McMackins II
Supporting Member, Electronic Frontier Foundation (#2296972)
Associate Member, Free Software Foundation (#12889)

www.mcmackins.org www.delwink.com
www.eff.org www.gnu.org www.fsf.org

On 07/19/2018 03:21 PM, Eric Auer wrote:
> 
> Hi David,
> 
>>> so basically you want to find out how fast you can
>>> update blocks of 64 000 pixels
>>
>> I never said that...
> 
> What I mean is that you changed ALL pixels to make
> sure to know what the frame rate in the WORST case
> is, when ALL 320 x 200 pixels actually need updates.
> 
>>> You probably want to optimize that copying routine.
> 
>> Not sure what that is.
> 
> Some people do really heavy tweaking to speed up games:
> 
> http://archive.gamedev.net/archive/reference/articles/article817.html
> 
> Note that this is sort of outdated, as most game people
> now worry about optimum use of 3d graphics chipsets.
> 
>>> In C, you can at least use logic calculations, to
>>> avoid having to do "if then" for every single pixel.
> 
>> Not sure what you mean here.
> 
> Roughly the following: If necessary, you first expand your
> transparency mask into a format which has one byte per pixel.
> 
> Then you negate the mask to get a one byte per pixel mask of
> the opposite of transparency. Then you compute something like
> 
> 1) new pixels = (old pixels & mask1) | (new pixels & mask2)
> 
> Where mask1 and mask2 are the negated forms of each other.
> This even works for alpha masks:
> 
> 2) new pixels = (old pixels "*" mask1) "+" (new pixels "*" mask2)
> 
> The trick is that you can do all operations using data
> types which are big enough for SEVERAL PIXELS. It means
> you can calculate the updated values for, say,
> FOUR pixels in ONE step. For the alpha mask version,
> this requires the ability to treat a 32 bit value as
> a vector of four 8 bit values. This is exactly what
> MMX does: Like a floating point coprocessor which is
> specialized for floating point calculations, MMX is
> a CPU component which is specialized for vectors :-)
> So the "*" and "+" must work on "bytes in a longer
> data type". A normal 386 "add" or "mul" would fail.
> 
> Note that MMX uses 64 bit values and newer stuff such
> as SSE uses even longer values, so you can do yet more
> pixels in parallel :-) The problem is that MMX, SSE and
> other things are often not well supported by compilers
> so you would have to manually write special code.
> 
> HOWEVER, the first (non-alpha) variant which only has
> yes / no decisions works with ALL COMPILERS which have
> a 32 bit integer data type :-) Of course it only gives
> you the expected speed when the compiler knows how to
> use 32 bit integers efficiently on 386 and newer CPUs.
> 
>> Maybe I need to check again, but I'm pretty sure VGA RAM
>> is considered outside my allocated memory.
> 
> In DJGPP, you can request a mapping of the VGA RAM to
> a normal pointer. Then you can use it as if it were
> part of your allocated memory. Using macros for a
> low level global memory peek or poke is much slower.
> 
> Here are some snippets from an old program of mine:
> 
> #include <dpmi.h> /* stuff with __dpmi_... names */
> #include <dos.h> /* int86, union REGS */
> #include <pc.h> /* things like inportb() */
> #include <go32.h> /* in case you want to access _dos_ds */
> #include <sys/farptr.h> /* e.g. _farpeekb(_dos_ds or other, offset) */
> 
> __dpmi_meminfo memory_mapping;
> int lfbSel;
> 
> memory_mapping.address = vesamode.lfbPTR; /* physical linear address */
> memory_mapping.size = ( (vesamode.bytes_line * vesamode.height)
>  + 65535) & 0xffff0000UL;    /* round up to multiple of 64k */
> 
> For VGA, you would just say address=0xa0000, size=0x10000, obviously.
> 
> __dpmi_physical_address_mapping(&memory_mapping); // fail if != 0
> __dpmi_lock_linear_region(&memory_mapping);
> 
> // for memory below 1 MB, this just made 1:1 mappings,
> // but you SHOULD use the LDT to stay more compatible:
> 
> lfbSel = __dpmi_allocate_ldt_descriptors(1); /* alloc 1 slot */
> __dpmi_set_segment_base_address(lfbSel, memory_mapping.address);
> __dpmi_set_segment_limit(lfbSel, memory_mapping.size - 1);
> 
> Now you can use _farpokeb(lfbSel, offset, value) for single
> bytes, _farpokew(...) for units of 16 bit and _farpokel(...)
> for units of 32 bit. You can also use other "far" stuff, but
> I have to admit that the example still is a bit tedious, using
> far pointers. Of course there is also _farpeekb(selector, offs)
> and _farpeekw(...) and _farpeekl(...) for reading the memory.
> 
> There also is good documentation about all this online :-)
> It sounds a bit complicated, but it is worth using in DJGPP.
> 
>> "Optimizing" by using a better video mode that *might not
>> be supported by the hardware* is not a real answer.
> 
> What I mean is: It is possible to use complicated VGA
> tricks to have multiple buffers and page flipping, but
> given how rare non-VESA hardware is, I would say it is
> sufficient to NOT try TOO hard to optimize for VGA and
> only "optimize a bit" for VGA. Because VESA is faster
> anyway and only a few users will suffer from slower
> performance when your program has to use VGA mode.
> 
> Regards, Eric
> 
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Freedos-devel mailing list
> Freedos-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/freedos-devel
> 
