Re: [Pixman] Basic infrastructure for MIPS32r2 and DSPr1 optimizations, and patch for mips* asm exports symbols.
Nemanja Lukic writes: > This patch set should address all previous code review remarks. From looking at this patch set fairly briefly, - There is some confusion about the type of the new memcpy function. Does it return bool or void*? And what precisely does the return value mean? It seems to be used both as "this call succeeded" (boolean) and "copy of the destination pointer" (void *). In my opinion, the memcpy routines should take a pixman_implementation_t * as the first argument, and have no return value. This is similar to the other routines exported by implementations. - I'm not sure the exported pixman_memcpy() and the s/memcpy/pixman_memcpy/g sed job make sense. The new memcpy() infrastructure should primarily be considered a way to support the implementations that wish to provide a faster memcpy() for the various graphics operations, and the various ancillary uses of memcpy() don't really need to super optimized. Also, as mentioned in the last review: - The blt() function in pixman-mips32r2.c is not MIPS specific. It can live in pixman-general and rely on the normal fallback mechanism. And instead of calling the new pixman_memcpy() it could just call implementation->toplevel->memcpy(), which is the usual way to call a lower-level function in pixman. - The fill_buff pointers should be static and defined inside the C files that use them. It's not generally a good idea to define variables in a header file. Søren ___ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman
[Pixman] Basic infrastructure for MIPS32r2 and DSPr1 optimizations, and patch for mips* asm exports symbols.
Some of the optimizations introduced in previous DSPr2 commits were not DSPr2 specific. Some of the fast-paths didn't used DSPr2 instructions at all, and rather utilized more generic MIPS32r2 instruction set or previous version of DSP instruction set (DSPr1) for optimizations. Since Pixman's run-time CPU detection only added DSPr2 fast-paths on 74K MIPS cores, these optimizations couldn't be used on cores that don't support DSPr2, but do support MIPS32r2 or DSPr1 instructions (these are almost all newer MIPS CPU cores like 4K, 24K, 34K, 1004K, etc). These patches extract those MIPS32r2 and DSPr1 specific optimizations into new fast-path sets, with appropriate build and run time infrastructure. With these pathes no new optimizations are introduced, only refactoring of previous DSPr2 optimizations into MIPS32r2-specific and DSPr1-specific. Future commits will add more MIPS32r2 and DSPr1 specific fast paths. For testing, lowlevel-blt benchmark is used and two different MIPS cores: - 24Kc - for MIPS32r2 - 1004Kc - for DSPr1 This patch set should address all previous code review remarks. ___ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman
Re: [Pixman] Basic infrastructure for MIPS32r2 and DSPr1 optimizations.
Hi, Some highlevel comments on this patch set: * Most of the patches need more detail in the commit log. Generally, there should be around the level information that someone who is familiar with pixman could write the patch just based on the commit log. It's fine to say just "Add src_x888_ to lowlevel-blt-bench" or "Update author's email address" and leave it at that, but for example MIPS: dspr2: runtime detection extended or MIPS: dspr2: Removed build restrictions and repair compiler's check is not enough detail. The log should say why the existing code is not good enough, and what the new code is doing to fix it. And it's not enough to just say *what*, you also need to say *why*. So, MIPS: disabled non 32-bit platforms This patch add mechanism which allows optimizations to be run only on 32-bit platforms. needs an explanation of why optimizations should only be run on 32-bit platforms. * In the final code after applying all the patches, the files pixman-mips32r32.c, pixman-mips-dspr2.c, pixman-mips-dspr1.c contain a word-for-word identical copy of the same blt function. This is really totally unnecessary. If you look at pixman-implementation.c, you'll see that there is already a mechanism in place to fall back from higher level implementations to lower levels. Moreover, the blt function that has been duplicated isn't even MIPS specific -- the functionality in question (implementing blt in terms of an optimized memcpy() function) makes just as much sense for other architectures. So I think this should be done by - adding a memcpy() field to pixman_implementation_t - adding something like the duplicated blt() function to pixman-general.c (except having it call implementation->toplevel->memcpy() instead of a global variable). - having the various MIPS implementations set the implementation of memcpy() to whatever they like. This has a number of advantages: - Other architectures can implement their own optimized memcpy() if they wish - There is no ad hoc global MIPS specific virtual memcpy function - There is no blt function duplicated across all MIPS implementations * Something similar applies to the fill() routines: The fill in pixman-mips-dspr2.c doesn't use any DSPr2 specific functionality, so it should not exist. Instead just rely on the fallback mechanism that is already present in pixman-implementation.c. Similarly, the DSPr1 functionality should look like this: if (bpp == 16) { fill_buff16_dspr1 return TRUE; } return FALSE; Then if the system in question supports MIPS2r2 and bpp is not 16, it will automatically fall back to the fill in pixman-mips32r2.c I can't say I love the global pixman_fill_buff* function pointers, but if they are made static and moved into the file that actually uses them (after making the above changed, each one will only be used from one file), I probably won't complain too much. I will also write some specific comments to some of the patches. Søren Nemanja Lukic writes: > Some of the optimizations introduced in previous DSPr2 commits were not DSPr2 > specific. Some of the fast-paths didn't used DSPr2 instructions at all, and > rather utilized more generic MIPS32r2 instruction set or previous version of > DSP instruction set (DSPr1) for optimizations. > Since Pixman's run-time CPU detection only added DSPr2 fast-paths on 74K MIPS > cores, these optimizations couldn't be used on cores that don't support > DSPr2, but do support MIPS32r2 or DSPr1 instructions (these are almost all > newer MIPS CPU cores like 4K, 24K, 34K, 1004K, etc). > These patches extract those MIPS32r2 and DSPr1 specific optimizations into > new fast-path sets, with appropriate build and run time infrastructure. With > these pathes no new optimizations are introduced, only refactoring of > previous DSPr2 optimizations into MIPS32r2-specific and DSPr1-specific. > Future commits will add more MIPS32r2 and DSPr1 specific fast paths. > For testing, lowlevel-blt benchmark is used and two different MIPS cores: > - 24Kc - for MIPS32r2 > - 1004Kc - for DSPr1 > > This patch set should address all previous code review remarks. > > Any comments are available. > ___ > Pixman mailing list > Pixman@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/pixman ___ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman
[Pixman] Basic infrastructure for MIPS32r2 and DSPr1 optimizations.
Some of the optimizations introduced in previous DSPr2 commits were not DSPr2 specific. Some of the fast-paths didn't used DSPr2 instructions at all, and rather utilized more generic MIPS32r2 instruction set or previous version of DSP instruction set (DSPr1) for optimizations. Since Pixman's run-time CPU detection only added DSPr2 fast-paths on 74K MIPS cores, these optimizations couldn't be used on cores that don't support DSPr2, but do support MIPS32r2 or DSPr1 instructions (these are almost all newer MIPS CPU cores like 4K, 24K, 34K, 1004K, etc). These patches extract those MIPS32r2 and DSPr1 specific optimizations into new fast-path sets, with appropriate build and run time infrastructure. With these pathes no new optimizations are introduced, only refactoring of previous DSPr2 optimizations into MIPS32r2-specific and DSPr1-specific. Future commits will add more MIPS32r2 and DSPr1 specific fast paths. For testing, lowlevel-blt benchmark is used and two different MIPS cores: - 24Kc - for MIPS32r2 - 1004Kc - for DSPr1 This patch set should address all previous code review remarks. Any comments are available. ___ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman
[Pixman] Basic infrastructure for MIPS32r2 and DSPr1 optimizations.
Some of the optimizations introduced in previous DSPr2 commits were not DSPr2 specific. Some of the fast-paths didn't used DSPr2 instructions at all, and rather utilized more generic MIPS32r2 instruction set or previous version of DSP instruction set (DSPr1) for optimizations. Since Pixman's run-time CPU detection only added DSPr2 fast-paths on 74K MIPS cores, these optimizations couldn't be used on cores that don't support DSPr2, but do support MIPS32r2 or DSPr1 instructions (these are almost all newer MIPS CPU cores like 4K, 24K, 34K, 1004K, etc). These patches extract those MIPS32r2 and DSPr1 specific optimizations into new fast-path sets, with appropriate build and run time infrastructure. With these pathes no new optimizations are introduced, only refactoring of previous DSPr2 optimizations into MIPS32r2-specific and DSPr1-specific. Future commits will add more MIPS32r2 and DSPr1 specific fast paths. For testing, lowlevel-blt benchmark is used and two different MIPS cores: - 24Kc - for MIPS32r2 - 1004Kc - for DSPr1 Any comments for these pathes are welcome. ___ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman
[Pixman] Basic infrastructure for MIPS32r2 and DSPr1 optimizations.
Some of the optimizations introduced in previous DSPr2 commits were not DSPr2 specific. Some of the fast-paths didn't used DSPr2 instructions at all, and rather utilized more generic MIPS32r2 instruction set or previous version of DSP instruction set (DSPr1) for optimizations. Since Pixman's run-time CPU detection only added DSPr2 fast-paths on 74K MIPS cores, these optimizations couldn't be used on cores that don't support DSPr2, but do support MIPS32r2 or DSPr1 instructions (these are almost all newer MIPS CPU cores like 4K, 24K, 34K, 1004K, etc). These patches extract those MIPS32r2 and DSPr1 specific optimizations into new fast-path sets, with appropriate build and run time infrastructure. With these pathes no new optimizations are introduced, only refactoring of previous DSPr2 optimizations into MIPS32r2-specific and DSPr1-specific. Future commits will add more MIPS32r2 and DSPr1 specific fast paths. For testing, lowlevel-blt benchmark is used and two different MIPS cores: - 24Kc - for MIPS32r2 - 1004Kc - for DSPr1 Per previous code review: - Patch set splitted in more smaller patches Any comments for these pathes are welcome. ___ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman