> 1) new pixels = (old pixels & mask1) | (new pixels & mask2)
>
> Where mask1 and mask2 are the negated forms of each other.
> This even works for alpha masks:
>
> 2) new pixels = (old pixels "*" mask1) "+" (new pixels "*" mask2)

I understand what you're going for in principle, but I'm not sure what
magic you're trying to do. The computation for number 1 above doesn't
actually work. This just returns what new_pixels was except with the
alpha value of old_pixels.

So far the best way to do this I can think of is to do a multi-byte
alpha check, and then if any one of the bytes fails, go through one byte
at a time to find the offending one so that it is not written, and then
proceed.

I've been pulling my hair out all day today trying to come up with a way
to do this without byte-level checking.

Happy Hacking,

David E. McMackins II
Supporting Member, Electronic Frontier Foundation (#2296972)
Associate Member, Free Software Foundation (#12889)

www.mcmackins.org www.delwink.com
www.eff.org www.gnu.org www.fsf.org

On 07/19/2018 03:21 PM, Eric Auer wrote:
>
> Hi David,
>
>>> so basically you want to find out how fast you can
>>> update blocks of 64 000 pixels
>>
>> I never said that...
>
> What I mean is that you changed ALL pixels to make
> sure to know what the frame rate in the WORST case
> is, when ALL 320 x 200 pixels actually need updates.
>
>>> You probably want to optimize that copying routine.
>
>> Not sure what that is.
>
> Some people do really heavy tweaking to speed up games:
>
> http://archive.gamedev.net/archive/reference/articles/article817.html
>
> Note that this is sort of outdated, as most game people
> now worry about optimum use of 3d graphics chipsets.
>
>>> In C, you can at least use logic calculations, to
>>> avoid having to do "if then" for every single pixel.
>
>> Not sure what you mean here.
>
> Roughly the following: If necessary, you first expand your
> transparency mask into a format which has one byte per pixel.
>
> Then you negate the mask to get a one byte per pixel mask of
> the opposite of transparency. Then you compute something like
>
> 1) new pixels = (old pixels & mask1) | (new pixels & mask2)
>
> Where mask1 and mask2 are the negated forms of each other.
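
To make formula 1 concrete, here is a minimal C sketch. It assumes 8-bit
pixels and a mask with one byte per pixel: 0xFF where the new pixel should
be written, 0x00 where the old pixel should stay (so mask2 is the mask and
mask1 is its bitwise negation). The function name blit_masked and this
mask convention are assumptions for illustration, not something taken from
the thread. Because every mask byte is either all ones or all zeros, the
AND/OR never mixes bits between neighbouring pixels, so four pixels can go
through one 32-bit operation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy src over dst wherever mask is 0xFF,
 * keep dst wherever mask is 0x00. All three buffers hold `count` bytes. */
static void blit_masked(uint8_t *dst, const uint8_t *src,
                        const uint8_t *mask, size_t count)
{
    size_t i = 0;

    /* Four pixels per step. memcpy avoids unaligned-access problems and
     * is normally compiled to a plain 32-bit load or store. */
    for (; i + 4 <= count; i += 4) {
        uint32_t d, s, m;
        memcpy(&d, dst + i, 4);
        memcpy(&s, src + i, 4);
        memcpy(&m, mask + i, 4);
        d = (d & ~m) | (s & m);   /* formula 1: mask1 = ~m, mask2 = m */
        memcpy(dst + i, &d, 4);
    }

    /* Leftover pixels when count is not a multiple of four. */
    for (; i < count; i++)
        dst[i] = (uint8_t)((dst[i] & ~mask[i]) | (src[i] & mask[i]));
}
```

The sketch only holds for masks that are strictly 0x00 or 0xFF per pixel;
any other mask value would blend individual bits, which is rarely what is
wanted.
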
> This even works for alpha masks:
>
> 2) new pixels = (old pixels "*" mask1) "+" (new pixels "*" mask2)
>
> The trick is that you can do all operations using data
> types which are big enough for SEVERAL PIXELS. It means
> you can calculate the updated values for, say,
> FOUR pixels in ONE step. For the alpha mask version,
> this requires the ability to treat a 32 bit value as
> a vector of four 8 bit values. This is exactly what
> MMX does: Like a floating point coprocessor which is
> specialized in floating point calculations, MMX is
> a CPU component which is specialized in vectors :-)
> So the "*" and "+" must work on "bytes in a longer
> data type". A normal 386 "add" or "mul" would fail.
>
> Note that MMX uses 64 bit values and newer stuff such
> as SSE uses even longer values, so you can do yet more
> pixels in parallel :-) The problem is that MMX, SSE and
> other things are often not well supported by compilers
> so you would have to manually write special code.
>
> HOWEVER, the first (non-alpha) variant which only has
> yes / no decisions works with ALL COMPILERS which have
> a 32 bit integer data type :-) Of course it only gives
> you the expected speed when the compiler knows how to
> use 32 bit integers efficiently on 386 and newer CPUs.
>
>> Maybe I need to check again, but I'm pretty sure VGA RAM
>> is considered outside my allocated memory.
>
> In DJGPP, you can request a mapping of the VGA RAM to
> a normal pointer. Then you can use it as if it were
> part of your allocated memory. Using macros for a
> low level global memory peek or poke is much slower.
>
> Here are some snippets from an old program of mine:
>
> #include <dpmi.h> /* stuff with __dpmi_... names */
> #include <dos.h> /* int86, union REGS */
> #include <pc.h> /* things like inportb() */
> #include <go32.h> /* in case you want to access _dos_ds */
> #include <sys/farptr.h> /* e.g. _farpeekb(_dos_ds or other, offset) */
>
> __dpmi_meminfo memory_mapping;
> int lfbSel;
>
> memory_mapping.address = vesamode.lfbPTR; /* physical linear address */
> memory_mapping.size = ( (vesamode.bytes_line * vesamode.height)
>   + 65535) & (uint32)0xffff0000; /* round up to multiple of 64k */
>
> For VGA, you would just say address=0xa0000, size=0x10000, obviously.
>
> __dpmi_physical_address_mapping(&memory_mapping); // fail if != 0
> __dpmi_lock_linear_region(&memory_mapping);
>
> // for memory below 1 MB, this just made 1:1 mappings,
> // but you SHOULD use the LDT to stay more compatible:
>
> lfbSel = __dpmi_allocate_ldt_descriptors(1); /* alloc 1 slot */
> __dpmi_set_segment_base_address(lfbSel, memory_mapping.address);
> __dpmi_set_segment_limit(lfbSel, memory_mapping.size - 1);
>
> Now you can use _farpokeb(lfbSel, offset, value) for single
> bytes, _farpokew(...) for units of 16 bit and _farpokel(...)
> for units of 32 bit. You can also use other "far" stuff, but
> I have to admit that the example still is a bit tedious, using
> far pointers. Of course there is also _farpeekb(selector, offs)
> and _farpeekw(...) and _farpeekl(...) for reading the memory.
>
> There also is good documentation about all this online :-)
> It sounds a bit complicated, but it is worth using in DJGPP.
>
>> "Optimizing" by using a better video mode that *might not
>> be supported by the hardware* is not a real answer.
>
> What I mean is: It is possible to use complicated VGA
> tricks to have multiple buffers and page flipping, but
> given how rare non-VESA hardware is, I would say it is
> sufficient to NOT try TOO hard to optimize for VGA and
> only "optimize a bit" for VGA. Because VESA is faster
> anyway and only a few users will suffer from slower
> performance when your program has to use VGA mode.
>
> Regards, Eric
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Freedos-devel mailing list
> Freedos-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/freedos-devel
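
Formula 2 from the quoted message is harder in plain C, because an ordinary
32-bit multiply lets carries spill from one byte into the next. The sketch
below shows one common workaround, under a simplification the thread does
not make: all four 8-bit values in the word share ONE alpha (0 to 256)
instead of a per-pixel alpha mask. The name blend4_uniform and the 0 to 256
alpha scale are assumptions for illustration. The idea is to spread the
bytes into two groups with a zero byte between them, so each product has
16 bits of room and cannot overflow into its neighbour.

```c
#include <stdint.h>

/* Hypothetical helper: blend four packed 8-bit values at once.
 * alpha = 0 returns old_px, alpha = 256 returns new_px.
 * Assumes alpha <= 256; larger values would overflow the 16-bit lanes. */
static uint32_t blend4_uniform(uint32_t old_px, uint32_t new_px,
                               uint32_t alpha)
{
    const uint32_t lanes = 0x00FF00FFu;   /* bytes 0 and 2 */
    uint32_t inv = 256u - alpha;

    /* Bytes 0 and 2: each product is at most 255 * 256, so it fits in
     * its 16-bit lane. Shifting right by 8 divides by 256. */
    uint32_t even = (((old_px & lanes) * inv +
                      (new_px & lanes) * alpha) >> 8) & lanes;

    /* Bytes 1 and 3: shift them down into the same lanes first. The
     * high byte of each product already sits in the right position. */
    uint32_t odd = (((old_px >> 8) & lanes) * inv +
                    ((new_px >> 8) & lanes) * alpha) & 0xFF00FF00u;

    return even | odd;
}
```

A true per-pixel alpha mask needs a separate multiplier for each lane,
which is what packed multiply instructions such as those in MMX provide;
this sketch does not cover that case.
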